Old 12-12-2011, 03:23 PM   #1 (permalink)
johnsonx42
Master Untangler
Join Date: Jan 2011
Posts: 625
disk suddenly full, postgres using all space

A customer box that has never used more than a few gigs of space suddenly ran out of disk space, so badly that the captive portal page wouldn't load properly. Nothing worked; I couldn't browse the web even on machines that bypass the captive portal.

I went in through SSH; df reported 0 blocks available on /. I deleted enough .gz files from /var/log to get the system functioning again (now showing 53M available on /).

Then I did "curl http://www.untangle.com/download/pat...ric/diskuse.sh | dash" which shows the problem is the postgres database:
Code:
Disk Use of postgres db files
66G     /var/lib/postgresql/8.3
So then, in /var/lib/postgresql/8.3/main/base/17844, I found all these 1 GB files:
Code:
-rw------- 1 postgres postgres 1.0G 2011-12-12 15:00 85135
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:12 85135.1
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:16 85135.10
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:16 85135.11
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:18 85135.12
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:18 85135.13
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:19 85135.14
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:20 85135.15
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:20 85135.16
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:20 85135.17
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:21 85135.18
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:21 85135.19
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:14 85135.2
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:21 85135.20
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:24 85135.21
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:24 85135.22
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:25 85135.23
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:25 85135.24
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:25 85135.25
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:26 85135.26
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:26 85135.27
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:26 85135.28
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:26 85135.29
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:15 85135.3
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:29 85135.30
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:29 85135.31
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:29 85135.32
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:29 85135.33
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:29 85135.34
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:29 85135.35
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:29 85135.36
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:31 85135.37
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:36 85135.38
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:40 85135.39
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:15 85135.4
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:45 85135.40
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:48 85135.41
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:52 85135.42
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:56 85135.43
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:59 85135.44
-rw------- 1 postgres postgres 1.0G 2011-12-12 04:03 85135.45
-rw------- 1 postgres postgres 1.0G 2011-12-12 04:07 85135.46
-rw------- 1 postgres postgres 1.0G 2011-12-12 04:11 85135.47
-rw------- 1 postgres postgres 1.0G 2011-12-12 04:15 85135.48
-rw------- 1 postgres postgres 1.0G 2011-12-12 05:08 85135.49
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:15 85135.5
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:15 85135.6
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:16 85135.7
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:16 85135.8
-rw------- 1 postgres postgres 1.0G 2011-12-12 03:16 85135.9
-rw------- 1 postgres postgres 1.0G 2011-12-12 15:07 85142
-rw------- 1 postgres postgres 1.0G 2011-12-12 02:23 85142.1
-rw------- 1 postgres postgres 1.0G 2011-12-11 12:08 85142.10
-rw------- 1 postgres postgres 1.0G 2011-12-11 19:21 85142.11
-rw------- 1 postgres postgres 1.0G 2011-12-12 06:37 85142.12
-rw------- 1 postgres postgres 1.0G 2011-12-08 23:31 85142.2
-rw------- 1 postgres postgres 1.0G 2011-12-09 06:45 85142.3
-rw------- 1 postgres postgres 1.0G 2011-12-09 14:00 85142.4
-rw------- 1 postgres postgres 1.0G 2011-12-09 21:05 85142.5
-rw------- 1 postgres postgres 1.0G 2011-12-10 05:31 85142.6
-rw------- 1 postgres postgres 1.0G 2011-12-10 12:46 85142.7
-rw------- 1 postgres postgres 1.0G 2011-12-10 20:01 85142.8
-rw------- 1 postgres postgres 1.0G 2011-12-11 05:00 85142.9
What can I do to reduce this? I take it I can't just go deleting these files.

Thanks for any help.

Edit: forgot to mention, this box updated to 9.1 some time last week.
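
A note on the listing above: PostgreSQL splits any table or index larger than 1 GB into segment files named after the relation's filenode (85135, 85135.1, 85135.2, and so on), so the listing shows two very large relations rather than dozens of unrelated files. Below is a minimal sketch for grouping such a listing by filenode and looking up the relation name. `count_segments` is a hypothetical helper, and the database name `uvm` is assumed from the commands used later in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read file names (one per line) on stdin and print
# "<filenode> <number of segment files>" for each relation.
count_segments() {
    # Strip the ".N" segment suffix, then count files per filenode.
    sed 's/\.[0-9][0-9]*$//' | sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage on the box (directory as in the post above):
#   ls /var/lib/postgresql/8.3/main/base/17844 | count_segments | sort -k2,2nr | head
#
# Map a filenode back to a table or index name. Run this against the
# database that owns the directory (assumed here to be uvm):
#   echo "SELECT relname, relkind FROM pg_class WHERE relfilenode = 85135;" \
#       | psql -U postgres uvm
```

The files should still not be deleted by hand; the lookup only tells you which table or index to deal with through PostgreSQL itself.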

Last edited by johnsonx42; 12-12-2011 at 03:28 PM..
Old 12-12-2011, 03:33 PM   #2 (permalink)
johnsonx42

Oh, I neglected to post the top tables part of the diskuse.sh output:
Code:
Top 20 Tables
               relname               |  reltuples  | relpages
-------------------------------------+-------------+----------
 n_cpd_block_events                  | 4.13673e+08 |  3447271
 n_cpd_block_events_event_id_idx     | 4.13673e+08 |   905761
 n_cpd_block_evt                     | 1.77918e+06 |    15745
 n_mail_message_info_addr            |      835399 |     8080
 n_http_events                       |      182561 |     7264
 n_cpd_block_evt_ts_idx              | 1.77918e+06 |     4880
 n_cpd_block_evt_pkey                | 1.77918e+06 |     4880
 n_mail_message_info                 |      202818 |     2461
 n_mail_message_info_addr_parent_idx |      835399 |     2298
 n_mail_message_info_addr_pkey       |      835399 |     2293
 sessions                            |      116319 |     1719
 n_mail_addrs                        |       29794 |     1435
 n_http_events_request_id_idx        |      182561 |      927
 n_http_events_policy_id_idx         |      182561 |      837
 n_http_evt_req_rid_idx              |        1889 |      722
 n_http_req_line_pkey                |        1889 |      722
 n_http_evt_req_pkey                 |        1889 |      722
 n_http_evt_req_ts_idx               |        1889 |      722
 session_totals                      |       52000 |      714
 pl_endp_sid_idx                     |        1277 |      714
(20 rows)

Oldest record in database
 2011-12-12 15:30:05.282
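
For scale: relpages counts disk pages, which are 8 kB each in a default PostgreSQL build, so the figures above can be turned into sizes with simple arithmetic. Keep in mind that relpages and reltuples are estimates that PostgreSQL refreshes only during VACUUM, ANALYZE, and similar commands, so they can lag behind what is actually on disk. A rough sketch, assuming the default 8 kB block size; `pages_to_mib` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: convert a relpages value to MiB,
# assuming the default PostgreSQL block size of 8 kB.
pages_to_mib() {
    echo $(( $1 * 8 / 1024 ))
}

pages_to_mib 3447271   # n_cpd_block_events              -> 26931
pages_to_mib 905761    # n_cpd_block_events_event_id_idx -> 7076

# For the current on-disk size instead of the catalog estimate
# (add a schema prefix if the table is not on the search path):
#   echo "SELECT pg_size_pretty(pg_total_relation_size('n_cpd_block_events'));" \
#       | psql -U postgres uvm
```

By these estimates the table is roughly 26 GB and its index roughly 7 GB, which is less than the 66G reported for the postgres directory, so the catalog figures were likely stale when the script ran.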
Old 12-12-2011, 05:09 PM   #3 (permalink)
johnsonx42

Should I just do this?
Code:
echo 'drop schema events cascade' | psql -U postgres uvm
/etc/init.d/untangle-vm restart
Old 12-12-2011, 06:49 PM   #4 (permalink)
westin
Untangler
Join Date: Feb 2009
Posts: 43

On my old UT, this would happen on occasion. I used something similar to that command. (It is in my documentation at work, so I don't have it in front of me.) It would clear it up for a while, but I never figured out exactly what was causing it. I have since upgraded my server and haven't had the problem since.
Old 12-12-2011, 10:16 PM   #5 (permalink)
johnsonx42

I tried the 'drop schema events cascade', as I didn't mind losing event detail, and it seemed to complete without error but only freed about 420M of disk - the postgres database is still using 65GB.

The top 20 tables section of the diskuse script output now shows:
Code:
Top 20 Tables
             relname             |  reltuples  | relpages
---------------------------------+-------------+----------
 n_cpd_block_events              | 4.13673e+08 |  3447271
 n_cpd_block_events_event_id_idx | 4.13673e+08 |   905761
 n_http_events                   |      182561 |     7264
 sessions                        |      124314 |     1928
 n_mail_addrs                    |       30864 |     1435
 n_http_events_request_id_idx    |      182561 |      927
 n_http_events_policy_id_idx     |      182561 |      837
 session_totals                  |       52000 |      714
 n_http_events_event_id_idx      |      182561 |      681
 sessions_pl_endp_id_idx         |      124314 |      550
 pg_attribute_relid_attnam_index |        7016 |      524
 n_http_totals                   |       26082 |      493
 pg_depend_depender_index        |        7104 |      484
 session_counts                  |       67352 |      470
 sessions_policy_id_idx          |      124314 |      443
 sessions_event_id_idx           |      124314 |      435
 pg_depend_reference_index       |        7104 |      407
 pg_attribute_relid_attnum_index |        7016 |      336
 pg_class_relname_nsp_index      |        1073 |      286
 n_mail_addr_totals              |       19876 |      280
(20 rows)

Oldest record in database
 2011-12-12 21:59:56.808
So it appears it cleaned up the n_cpd_block_evt tables, as those 3 no longer appear in the top 20, but nothing changed on the bigger n_cpd_block_events tables.

Help?
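
Since dropping the events schema left n_cpd_block_events untouched, one possibility is that the table lives in a different schema. Nothing in this thread confirms that, so the sketch below only shows how it could be checked using the standard pg_class and pg_namespace catalogs. `schema_lookup_sql` is a hypothetical helper name, and the database name `uvm` is the one used in the commands above.

```shell
#!/bin/sh
# Hypothetical helper: print a catalog query that lists the schema, kind
# and estimated page count of every relation whose name starts with the
# given prefix. (In LIKE, "_" matches any single character, so the match
# is slightly looser than a literal prefix.)
schema_lookup_sql() {
    cat <<EOF
SELECT n.nspname, c.relname, c.relkind, c.relpages
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname LIKE '$1%'
ORDER BY c.relpages DESC;
EOF
}

# Usage on the box:
#   schema_lookup_sql n_cpd_block_events | psql -U postgres uvm
```

If the nspname column shows something other than events, that would explain why the cascade drop freed so little space.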
Old 12-13-2011, 06:25 AM   #6 (permalink)
westin

I will try to look at my docs when I get into work, and post the command that I used.
Old 12-13-2011, 07:27 AM   #7 (permalink)
mirage98
Newbie
Join Date: Dec 2011
Posts: 2

The exact same thing has occurred on our Untangle box. The system was working great, auto-updated to 9.1, and the captive portal broke.

Logged into the box and realized disk usage had grown "overnight". Luckily it had a second hard drive, so the postgres database was moved there. Came in this morning and the DB had grown another 40 GB and consumed the second drive.

n_cpd_block_events and n_cpd_block_events_event_id_idx are my two large table culprits.
Old 12-13-2011, 07:59 AM   #8 (permalink)
westin

What I used to use is very similar to what you tried, but I stopped the UVM first, and then ran one additional command at the end, after restarting the UVM. Not sure that it will make a difference, but try:

Code:
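# Stop the UVM so nothing is writing to the database
/etc/init.d/untangle-vm stop

# Drop the events schema (this discards stored event detail)
echo 'DROP SCHEMA events CASCADE;' | psql -U postgres uvm

# Restart the UVM
/etc/init.d/untangle-vm start

# Run fixdb.sql (expected in the current directory) against the uvm database
psql -U postgres -f fixdb.sql uvm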
Old 12-13-2011, 06:12 PM   #9 (permalink)
ngpl
Newbie
Join Date: Dec 2011
Posts: 2

Same here on the n_cpd_block_events and n_cpd_block_events_event_id_idx tables having filled 80 GB of drive. I've cleared enough space off to get Untangle up and running, but I can't imagine it's going to last very long. Any patch on the way or simple way to empty that table specifically? fixdb.sql isn't available any longer and I similarly didn't get much space back after the drop schema events cascade.
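
On emptying one table specifically: in PostgreSQL, TRUNCATE removes every row and returns the disk space to the operating system immediately, whereas a plain DELETE generally does not shrink the files on disk. The sketch below rests on loud assumptions: nobody in this thread has confirmed that Untangle tolerates this table being emptied, that the table is visible without a schema prefix, or that 9.1 will not simply fill it again. `truncate_sql` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print a TRUNCATE statement for one table.
# TRUNCATE discards every row in the table; there is no undo once committed.
truncate_sql() {
    printf 'TRUNCATE TABLE %s;\n' "$1"
}

# Usage on the box, stopping the UVM first as suggested earlier in the
# thread (table name as reported by diskuse.sh; add a schema prefix if
# the table is not on the search path):
#   /etc/init.d/untangle-vm stop
#   truncate_sql n_cpd_block_events | psql -U postgres uvm
#   /etc/init.d/untangle-vm start
#   df -h /
```

Truncating the table also empties its indexes, so n_cpd_block_events_event_id_idx should shrink along with it.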
Old 12-14-2011, 02:38 AM   #10 (permalink)
bowerj01
Newbie
Join Date: Dec 2010
Posts: 10

Having the same problem, with a 136 GB folder: var/lib/postgresql/8.3/main/base/16387

Build: 9.1.0~svn20111209r30408release9.1-1lenny

Any ideas would be greatly appreciated; I don't fancy a 200-mile round trip to default the server.

Thanks in advance for any help.
All times are GMT -7.

© 2010 Untangle, Inc. All Rights Reserved.