Page 1 of 4
Results 1 to 10 of 33
  1. #1
    Master Untangler
    Join Date
    Oct 2010
    Posts
    115

    "Server Load Is Very High" notifications for the last 3 days

    For the last 3 days, at approximately the same time each day, we have been receiving "Server Load Is Very High" notification emails. When we check the CPU load and Memory usage at that point, both are very high.

    CPU load goes into the red zone and the Memory shows very little free. End users start having problems accessing websites, remote users start having problems connecting, and access to the main Untangle overview screen takes a long time to load.

    After about 30 minutes the CPU and Memory load settles back down to normal levels and everything is fine again.

    Under normal day to day usage Memory sits at around 50-60% and CPU load is almost at zero.

    The main thing we've noticed is that when the problems occur it is roughly the same time each day (10:19AM on the first day, 10:26AM on the second day and 11:09AM on the third day).

    The sessions per minute at the times of the problem are not excessive and are actually a lot less than at other times during the day.

    We haven't made any changes at all and these problems only started happening on Tuesday.

    We're running Untangle on an (admittedly older) Dell Vostro box with a Quad Core processor and 4GB of RAM, but up until this week we have had zero problems with Untangle in all the time we've had it. 65 users in total.

    It seems that something on Untangle might be triggering at roughly the same time each day but as we've not made any changes at all we're not sure what this could be.

    I was wondering if anyone had any ideas of what might be causing this?

    Thanks in advance

  2. #2
    Untangle Ninja sky-knight's Avatar
    Join Date
    Apr 2008
    Location
    Phoenix, AZ
    Posts
    26,488

    Are you using Intrusion Prevention?

    I had two units post the v16.4.1 upgrade start doing this. I was able to get into one of the units while it was loaded up and saw Suricata sucking up CPU like crazy.

    Powering off the module cleared the condition for me, I've not had enough time yet to know if simply pulling the module and reinstalling it will fix it. But so far so good.
    Rob Sandling, BS:SWE, MCP
    NexgenAppliances.com
    Phone: 866-794-8879 x201
    Email: support@nexgenappliances.com

  3. #3
    Master Untangler
    Join Date
    Oct 2010
    Posts
    115

    No we're not using the Intrusion Prevention app.

    We have 4.19.0-11-untangle-amd64. Uptime is just over 1 year.

    If there was a way of seeing exactly what is using up the CPU and Memory (similar to Task Manager in Windows) I think that would help.

    Unfortunately we don't have a support contract. The paid apps we use are the WAN Balance and WAN Failover.

  4. #4
    Untangler jcoffin's Avatar
    Join Date
    Aug 2008
    Location
    Lake Tahoe
    Posts
    9,655

    Quote Originally Posted by BadBoyHouse View Post
    We have 4.19.0-11-untangle-amd64. Uptime is just over 1 year.
    That is the kernel version not the software version. What is the software version?

    Quote Originally Posted by BadBoyHouse View Post
    If there was a way of seeing exactly what is using up the CPU and Memory (similar to Task Manager in Windows) I think that would help.
    The top command on the CLI will show the processes using the CPU, in order of load.
    Attention: Support and help on the Untangle Forums is provided by
    volunteers and community members like yourself.
    If you need Untangle support please call or email support@untangle.com

  5. #5
    Master Untangler
    Join Date
    Oct 2010
    Posts
    115

    Quote Originally Posted by jcoffin View Post
    That is the kernel version not the software version. What is the software version?

    The top command on the CLI will show the processes using the CPU, in order of load.
    Is there a guide or walkthrough on how to run this command? Accessing the CLI isn't something I've done before.

    Sorry, the build number is 16.4.1.20211102T072340.200b87d9a3-1buster
    Last edited by BadBoyHouse; 12-09-2021 at 10:46 AM. Reason: mistake

  6. #6
    Master Untangler
    Join Date
    Oct 2010
    Posts
    115

    Quote Originally Posted by sky-knight View Post
    Are you using Intrusion Prevention?

    I had two units post the v16.4.1 upgrade start doing this. I was able to get into one of the units while it was loaded up and saw Suricata sucking up CPU like crazy.

    Powering off the module cleared the condition for me, I've not had enough time yet to know if simply pulling the module and reinstalling it will fix it. But so far so good.
    What method did you use for checking the CPU usage at the point the spikes were occurring?

    I'm hoping that if I can figure out how to view exactly what is using the CPU at the times of our spikes it might help me see what the root cause of the spikes is.

  7. #7
    Untangle Ninja sky-knight's Avatar
    Join Date
    Apr 2008
    Location
    Phoenix, AZ
    Posts
    26,488

    I used the method above. What you do is SSH into the box.

    SSH clients are built into PowerShell or Windows Terminal if you're a Windows user. The login is root, and the password is whatever the first admin account created on that Untangle uses as a web UI password.

    To SSH into an Untangle you need to create an access rule for the traffic: config -> network -> advanced -> access rules.

    There is a default SSH rule there you can look at to get a basic idea of what rule you need. You need to COPY that rule and add a fourth flag, either source address or source interface. Whatever you do, DO NOT ENABLE the default SSH rule, because it will open SSH to the world without restriction, and if you do not have a very strong password your box will find itself cracked very soon afterward.

    Anyway, once you have a properly formed access rule so that SSH can only work from your trusted location or network, open up PowerShell / Windows Terminal and type in:

    Code:
    ssh root@untangle.ip.goes.here
    If it works you'll get a warning about the host key fingerprint, which you answer by typing yes, and then a prompt for the password. Once you're at the # prompt, just type in the command:
    Code:
    top
    You'll see an output that's sort of like task manager, with the data you want.

    Press Ctrl+C (or q) to exit top.

    Type the exit command to disconnect from the SSH session, and exit again to close the command prompt.
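
Because the spikes only last about half an hour, it may help to capture top's output non-interactively so it can be saved or pasted into a reply. The following is a minimal sketch, assuming the procps-ng version of top found on Debian-based systems (`-b` batch mode, `-n` iteration count, `-d` delay, `-o` sort field). The helper names `last_frame` and `snapshot` are made up for illustration and are not part of Untangle.

```shell
#!/bin/sh
# Keep only the last frame of batch-mode top output read from stdin.
# Each frame starts with a line beginning "top - ". The first frame has
# no earlier sample to compare against, so its CPU percentages tend to be
# less representative; the second frame is the one worth reading.
last_frame() {
    awk '/^top - /{buf=""} {buf = buf $0 "\n"} END{printf "%s", buf}'
}

# Print a snapshot sorted by the given field (%CPU or %MEM), trimmed to
# the summary header plus roughly the top dozen processes.
snapshot() {
    top -b -n 2 -d 1 -o "$1" | last_frame | head -n 20
}

# Run only if top is available on this machine
if command -v top >/dev/null 2>&1; then
    snapshot %CPU
    snapshot %MEM
fi
```

Run on the Untangle box itself, this prints one listing sorted by CPU and one sorted by memory, which covers both of the symptoms described in the first post.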

  8. #8
    Untangler
    Join Date
    Nov 2017
    Posts
    45

    I've been seeing this since v16. I get the email alerts, but can never catch it in real time. This server is a Xeon with 24 CPUs and 96GB RAM. Here's yesterday:
    Load (1-minute): 301.06, Load (5-minute): 341.45, Load (15-minute): 150.27

    The problem seems to shift by about an hour each day for me, but it's not every day: sometimes 3 times a week, sometimes 5 times. It happens at both off-peak and on-peak times.

    I really need to open a ticket, I just keep hoping to catch it happening so I don't have to fumble about trying to pinpoint what's causing it.
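
One way to catch an intermittent spike without sitting at the console might be a small script, run once a minute (from cron, for example), that records what is running whenever the load is high. This is only a sketch under stated assumptions: the `THRESHOLD` of 2.0 per core, the `LOG_FILE` path, and the function names are illustrative choices, not Untangle settings. It relies on `/proc/loadavg`, `nproc`, and batch-mode `top`, which are standard on Debian-based Linux.

```shell
#!/bin/sh
# Sketch: append a process snapshot to a log whenever load is high, so the
# culprit can be read afterwards. Intended to be run once a minute.
# THRESHOLD and LOG_FILE are illustrative values, not Untangle settings.
THRESHOLD="${THRESHOLD:-2.0}"                    # 1-minute load per CPU core
LOG_FILE="${LOG_FILE:-/tmp/high-load-snapshots.log}"

# Succeeds (exit 0) when load divided by cores is above the limit.
# Usage: load_is_high LOAD CORES LIMIT
load_is_high() {
    awk -v load="$1" -v cores="$2" -v limit="$3" \
        'BEGIN { exit !((load / cores) > limit) }'
}

check_once() {
    [ -r /proc/loadavg ] || return 0             # Linux only
    read -r load1 _rest < /proc/loadavg          # first field = 1-minute load
    cores=$(nproc 2>/dev/null || echo 1)
    if load_is_high "$load1" "$cores" "$THRESHOLD"; then
        {
            echo "=== $(date) load=$load1 cores=$cores ==="
            # Two frames, one second apart; the second frame is usually
            # the more representative one for per-process CPU.
            top -b -n 2 -d 1
        } >> "$LOG_FILE"
    fi
}

check_once
```

Dividing by the core count keeps the threshold comparable across machines: a 1-minute load of 301.06 on 24 CPUs is roughly 12.5 per core, far above anything the box can service without queuing.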

  9. #9
    Master Untangler
    Join Date
    Jan 2011
    Posts
    103

    Started a thread along similar lines here:

    https://forums.untangle.com/ng-firew...cpu-usage.html

    Also raised a ticket for the issue and was told "Your CPU looks a bit old, you should upgrade, get back to me once you have done this..." WT* ?

    Also, in reference to what Rob was saying about Intrusion Prevention, I've just looked at the Intrusion Prevention reports. They all look normal, except that Reports -> Intrusion Prevention -> All / Logged / Blocked events are all blank, with no events listed, despite all the other reports showing data. Something looks broken to me...

    Turned the module off and on, but the data is still missing from the reports. Hate doing it, but time for a reboot?
    Last edited by tescophil; 12-30-2021 at 05:58 AM.

  10. #10
    Master Untangler
    Join Date
    Jul 2010
    Location
    Nanaimo B.C
    Posts
    708

    Quote Originally Posted by tescophil View Post
    Started a thread along similar lines here:

    https://forums.untangle.com/ng-firew...cpu-usage.html

    Also raised a ticket for the issue and was told "Your CPU looks a bit old, you should upgrade, get back to me once you have done this..." WT* ?

    Also, in reference to what Rob was saying about Intrusion Prevention, I've just looked at the Intrusion Prevention reports. They all look normal, except that Reports -> Intrusion Prevention -> All / Logged / Blocked events are all blank, with no events listed, despite all the other reports showing data. Something looks broken to me...

    Turned the module off and on, but the data is still missing from the reports. Hate doing it, but time for a reboot?
    Nothing wrong with a reboot. What are the hardware details: HDD, CPU, RAM, and NICs in the box?
    Started Youtube Channel, Have a question about Untangle Ask me : jason @ jasonslab.ca
    https://www.youtube.com/c/jasonslabvideos << Please like and subscribe, helps me out !!
