  1. #1
    Newbie
    Join Date
    Nov 2020
    Posts
    12

    Question UVM Crashes every couple days (continued)

    Reposted to the right category (I don't know how to delete the other one).
    Hello guys, I'm back with the same issue I had in this thread: https://forums.untangle.com/networki...ng-server.html

    I thought it was fixed, but it started happening again. I'm making a new thread because this time I also got some disk errors (thread here: https://forums.untangle.com/networki...sk-errors.html), and I was wondering whether the two issues together point to anything. I seriously do not know where to go from here.

    EDIT: I should also note that the issue from the first link is worse now: Ethernet doesn't work at all anymore. The whole server crashes, but the PC lights stay on.
    Last edited by bordyboy; 03-20-2021 at 11:53 AM.

  2. #2
    Untangle Ninja sky-knight's Avatar
    Join Date
    Apr 2008
    Location
    Phoenix, AZ
    Posts
    26,393

    Default

    Both of your links are broken.

    But I'll put it out there that most of the time, disk errors are exactly that, and they generally mean a defective disk. However, with Linux things get a bit twisty sometimes, because defective RAM can manifest as a defective disk in many cases.

    Either way, it's hardware... and needs testing.

    I've seen the other thread regarding disk, and I caution you to not do what the other poster did.

    You do have this: https://wiki.debian.org/SSDOptimization

    And yes, there's a bit in there about enabling automatic TRIM, and while that's true, I've NEVER had to do that. I've sold and supported Untangle on nothing but SSD-equipped boxes for over half a decade now. TRIM is done by the drive's firmware automatically, and if it's not... the drive is CRAP, get a new one. This is a SERVER, treat it like one.
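
If you'd rather check what the kernel thinks of the drive before touching any TRIM settings, here is a minimal Python sketch. It assumes a Linux box that exposes the usual block-queue attribute `/sys/block/<device>/queue/discard_max_bytes`, where a value of 0 means the kernel sees no discard (TRIM) support. The function name, the `sysfs_root` parameter, and the `sda` device name are illustrative and not taken from this thread.

```python
from pathlib import Path


def discard_supported(device: str, sysfs_root: str = "/sys") -> bool:
    """Return True if the kernel reports a nonzero discard limit for the device."""
    limit_file = Path(sysfs_root) / "block" / device / "queue" / "discard_max_bytes"
    try:
        return int(limit_file.read_text().strip()) > 0
    except (OSError, ValueError):
        # Missing or unreadable attribute: treat it as "no discard support".
        return False


if __name__ == "__main__":
    # "sda" is only an example device name; substitute the real one.
    print("discard supported:", discard_supported("sda"))
```

This only tells you whether discard is available to the OS. It says nothing about the health of the drive itself.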
    Last edited by sky-knight; 03-20-2021 at 07:24 AM.
    Rob Sandling, BS:SWE, MCP
    NexgenAppliances.com
    Phone: 866-794-8879 x201
    Email: support@nexgenappliances.com

  3. #3
    Newbie
    Join Date
    Nov 2020
    Posts
    12

    Default

    I fixed my links.

    The link you posted doesn't load for me (tried multiple browsers).

    I currently have a SanDisk in the M.2 slot of the motherboard.

    I should probably also mention other issues I've had with this PC, which might point to faulty RAM.

    Here's what happened: the internet just stops. I can see the PC lights are still on, but the server has definitely crashed, since I lose all my connections. So I try to reboot to finish my work online, and the PC just turns off and on repeatedly. I then tried different RAM sticks in different slots (I have 4 sticks to work with), and one of them finally worked and the PC POSTed. Then I did a fresh install and ran into the error from the second link in my original post, and tonight it crashed again, as mentioned earlier. I'm tempted to just buy a Samsung SATA SSD and try that, but I'm also tempted to just get a $200 renewed Netgear router: https://www.amazon.com/dp/B08M7RGQ1S...EGT9D52D8J61CF. I haven't decided yet.
    Last edited by bordyboy; 03-20-2021 at 12:12 PM.

  4. #4
    Untangle Ninja sky-knight's Avatar
    Join Date
    Apr 2008
    Location
    Phoenix, AZ
    Posts
    26,393

    Default

    Well, I'd probably just test RAM until I found a set that was good, and reinstall the platform.

    The problem with Linux and bad RAM is that the OS will use whatever free RAM it can for filesystem cache. This is great for performance, but it means that bad cells in RAM will corrupt the filesystem as cached data gets written back to disk. So after a while, defective RAM can eat your install!

    So again, fix the RAM, reinstall, restore backup.
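
To make the mechanism concrete, here is a toy Python sketch of it. It is a simulation, not how the kernel's page cache is implemented. A block passes through a "cache" in RAM before being flushed to "disk", and a single faulty bit changes what lands on disk. All names (`flip_bit`, `flush_from_cache`, the sample data, the bit index) are made up for illustration.

```python
import hashlib
from typing import Optional


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted, as a bad RAM cell might do."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def flush_from_cache(block: bytes, faulty_bit: Optional[int] = None) -> bytes:
    """Model a cached write: the block sits in RAM, then is flushed to 'disk'."""
    cached = block if faulty_bit is None else flip_bit(block, faulty_bit)
    return cached  # this is what actually reaches the disk


original = b"untangle config block" * 4
on_disk_good = flush_from_cache(original)
on_disk_bad = flush_from_cache(original, faulty_bit=37)

digest = hashlib.sha256(original).hexdigest()
print(digest == hashlib.sha256(on_disk_good).hexdigest())  # True
print(digest == hashlib.sha256(on_disk_bad).hexdigest())   # False
```

The disk stores exactly what it was handed, so the drive can look like the culprit even though the damage happened in memory.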

  5. #5
    Newbie
    Join Date
    Nov 2020
    Posts
    12

    Default

    It just crashed again. This time the internet kept half-working, so I didn't realize until I couldn't get into a Discord call. I restarted it and now I have some new disk errors, which may help identify the exact issue:

    Disk errors reported.
    Mar 18 19:28:16 untangle kernel: [83964.606194] ata5.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
    Mar 18 19:28:16 untangle kernel: [83964.607813] ata5.00: configured for UDMA/33
    Mar 18 19:31:58 untangle kernel: [84187.011858] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
    Mar 18 19:31:58 untangle kernel: [84187.012190] ata5.00: failed to IDENTIFY (I/O error, err_mask=0x100)
    grep: write error: Broken pipe
    tail: error writing 'standard output': Broken pipe

  6. #6
    Untangle Ninja sky-knight's Avatar
    Join Date
    Apr 2008
    Location
    Phoenix, AZ
    Posts
    26,393

    Default

    That would be your SATA link dropping because of a drive fault. The controller reset the link, renegotiated at a slower speed (1.5 Gbps, UDMA/33), then tried to re-identify the drive and failed.

    So either the SATA controller driver in Debian doesn't like your SATA controller, or the SATA device itself has a faulty controller, or the SATA cable connecting the two is defective.
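
If the log gets long, a small script can pull out the relevant events. This is a rough Python sketch that only recognizes the two message shapes quoted in the post above ("SATA link up" and "failed to IDENTIFY"). The regexes and the `summarize` helper are my own and are not part of any Untangle or Debian tool.

```python
import re

# Patterns for the two kernel message shapes quoted in this thread.
LINK_RE = re.compile(r"(ata\d+): SATA link up ([\d.]+) Gbps")
FAIL_RE = re.compile(r"(ata\d+(?:\.\d+)?): failed to IDENTIFY \((.+)\)")


def summarize(lines):
    """Return (port, event, detail) tuples for link-up and IDENTIFY-failure lines."""
    events = []
    for line in lines:
        match = LINK_RE.search(line)
        if match:
            events.append((match.group(1), "link_up", match.group(2) + " Gbps"))
            continue
        match = FAIL_RE.search(line)
        if match:
            events.append((match.group(1), "identify_failed", match.group(2)))
    return events


sample = [
    "Mar 18 19:31:58 untangle kernel: [84187.011858] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)",
    "Mar 18 19:31:58 untangle kernel: [84187.012190] ata5.00: failed to IDENTIFY (I/O error, err_mask=0x100)",
]

for event in summarize(sample):
    print(event)
```

A link-up at a reduced speed followed right away by an IDENTIFY failure on the same port is the pattern to look for.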

  7. #7
    Newbie
    Join Date
    Nov 2020
    Posts
    12

    Default

    It can't be the SATA cable, since right now I have an M.2 SSD installed directly on the motherboard. Do you think the motherboard might be bad?

  8. #8
    Untangle Ninja sky-knight's Avatar
    Join Date
    Apr 2008
    Location
    Phoenix, AZ
    Posts
    26,393

    Default

    That would be physical damage to the M.2 port, or a defective SSD, then. The latter is vastly more common, but both are possible.

  9. #9
    Untangler
    Join Date
    Mar 2018
    Location
    Toronto, Ontario
    Posts
    66

    Default

    If you are referring to my post about the fstrim fix: something introduced in Untangle version 16.2 causes these SSD errors. The exact same hardware (memory and SSD) ran versions 16.1, 16.0 and 15.x without a problem until now.

    This is just speculation, since I only have a home license and thus no official support.

  10. #10
    Untangle Ninja sky-knight's Avatar
    Join Date
    Apr 2008
    Location
    Phoenix, AZ
    Posts
    26,393

    Default

    And again, 100% of the Untangle installs I have in the wild are on v16.2.2 right now, and all of them are on SSDs.

    I don't have alerts of drive faults in my dashboard. So I don't have any idea why that's happening in your case.
