Bug #555: SQLite WAL grows without bound when sqlfs under heavy load - IOCipher - Guardian Project Dev (ARCHIVED SITE)

Bug #555

SQLite WAL grows without bound when sqlfs under heavy load

Added by hans almost 5 years ago. Updated almost 4 years ago.

Status:

New

Start date:

01/24/2013

Priority:

Normal

Due date:

Assignee:

% Done:

Category:

Target version:

Component:

libsqlfs

Description

When sqlfs is under heavy load (like 3x fsx sessions), the SQLite WAL log grows without bound until it fills all available space. I used the attached patch to run sqlite3_wal_checkpoint_v2() at regular intervals. I ran three instances of fsx:

~/code/guardianproject/libsqlfs/tests/fsx -d -l 10485760 -o 1048576 /mnt/testfile-big
~/code/guardianproject/libsqlfs/tests/fsx -d -l 10485760 -o 1048576 /mnt/testfile-big2
~/code/guardianproject/libsqlfs/tests/fsx -d -c 25 /mnt/testfile-c25

With sqlite3_wal_checkpoint_v2() every

1000, the WAL log grew to ~248MB in ~1 minute
100, the WAL log grew to ~600MB in ~1 minute
1, the WAL log grew to ~40MB in ~1 minute

Fixes suggested by sjlombardo:

we could do nothing, and leave it as is. The WAL is pretty much working as documented; it's just never getting a chance to completely checkpoint because the database is never idle. If you don't think this level of continuous activity is likely to occur in live usage, then most applications should never see this sort of behavior
we could introduce some sort of periodic blocking checkpoint, e.g. after every N commits, explicitly force a blocking checkpoint via sqlite3_wal_checkpoint_v2. That would balance out the performance impact of running the checkpoints
we could disable WAL and switch back to the standard journal mode. The changes we made to obtain reserved locks using BEGIN IMMEDIATE, and the addition of the busy handler, should make the library stable under load even without WAL, though we'd lose the performance boost on writes and the improved read-write concurrency
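
The second suggestion (a blocking checkpoint after every N commits) could be sketched roughly as below. This is only an illustration of the counting logic, not the code from the attached patches. The names checkpoint_policy, checkpoint_fn and count_calls are hypothetical, and the callback is assumed to wrap a call such as sqlite3_wal_checkpoint_v2(db, NULL, SQLITE_CHECKPOINT_RESTART, NULL, NULL), returning its result code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: in libsqlfs this would wrap something like
 * sqlite3_wal_checkpoint_v2(db, NULL, SQLITE_CHECKPOINT_RESTART, NULL, NULL)
 * and return its result code (0 on success). */
typedef int (*checkpoint_fn)(void *ctx);

/* Hypothetical bookkeeping for "checkpoint after every N commits". */
struct checkpoint_policy {
    unsigned every_n;     /* force a checkpoint after this many commits */
    unsigned since_last;  /* commits since the last successful checkpoint */
    checkpoint_fn run;    /* performs the blocking checkpoint */
    void *ctx;            /* passed through to run(), e.g. the db handle */
};

static void checkpoint_policy_init(struct checkpoint_policy *p,
                                   unsigned every_n,
                                   checkpoint_fn run, void *ctx)
{
    assert(p != NULL && run != NULL && every_n > 0);
    p->every_n = every_n;
    p->since_last = 0;
    p->run = run;
    p->ctx = ctx;
}

/* Call once after each successful COMMIT.
 * Returns 1 if a checkpoint was attempted and succeeded, 0 if none was
 * due, and -1 if a checkpoint was attempted but failed (for example
 * because the database was busy). On failure the counter is kept, so
 * the next commit retries. */
static int checkpoint_policy_on_commit(struct checkpoint_policy *p)
{
    p->since_last++;
    if (p->since_last < p->every_n)
        return 0;
    if (p->run(p->ctx) != 0)
        return -1;
    p->since_last = 0;
    return 1;
}

/* Stand-in callback for demonstration only: counts invocations
 * instead of touching a database. */
static int count_calls(void *ctx)
{
    ++*(int *)ctx;
    return 0;
}
```

Keeping the counter on failure means a busy database simply delays the checkpoint to the next commit rather than skipping a whole interval. Whether the blocking call can ever complete under continuous load is a separate question.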

0002-use-sqlite3_wal_checkpoint_v2-to-slow-WAL-log-growth.patch - better implementation idea (1.23 KB) hans, 01/24/2013 03:02 am

0001-use-sqlite3_wal_checkpoint_v2-to-slow-WAL-log-growth.patch - tests were run with this patch (1.76 KB) hans, 01/24/2013 03:02 am

Associated revisions

Revision c40ac891
Added by Hans-Christoph Steiner over 3 years ago

set WAL journal size limit to 10% of available space or 10M

Previously, there was no limit to the size of the WAL log file, and under
heavy load, it could grow quite a bit. This sets a limit as either 10Megs
or 10% of the available space, whichever is larger.

refs #555 https://dev.guardianproject.info/issues/555
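
As a rough illustration of the rule in this commit message (the larger of 10 MB or 10% of the available space), the limit could be computed as below. This is a sketch under assumptions, not the code from c40ac891: it assumes the limit is applied through SQLite's journal_size_limit pragma, that "10M" means 10 MiB, and that the available space comes from something like statvfs() (f_bavail * f_frsize). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* "10M" from the commit message; assumed here to mean 10 MiB. */
#define WAL_LIMIT_FLOOR_BYTES (10LL * 1024 * 1024)

/* Larger of 10 MiB and 10% of the available space, in bytes. */
static int64_t wal_size_limit(int64_t available_bytes)
{
    assert(available_bytes >= 0);
    int64_t tenth = available_bytes / 10;
    return tenth > WAL_LIMIT_FLOOR_BYTES ? tenth : WAL_LIMIT_FLOOR_BYTES;
}

/* Formats the pragma statement into buf so it can be handed to
 * sqlite3_exec(). Returns 0 on success, -1 if buf is too small. */
static int format_wal_limit_pragma(char *buf, size_t buflen,
                                   int64_t available_bytes)
{
    int n = snprintf(buf, buflen, "PRAGMA journal_size_limit = %lld;",
                     (long long)wal_size_limit(available_bytes));
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Note that in WAL mode journal_size_limit controls how large a WAL file SQLite leaves on disk after a checkpoint or reset; it does not by itself stop the file from growing while checkpoints cannot complete.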

History

#1 Updated by hans almost 5 years ago

File 0002-use-sqlite3_wal_checkpoint_v2-to-slow-WAL-log-growth.patch added
File 0001-use-sqlite3_wal_checkpoint_v2-to-slow-WAL-log-growth.patch added
Target version set to 0.1

0001 is the patch I ran the tests with; 0002 is a patch with a different implementation of the counter that I think makes more sense.

#2 Updated by abeluck almost 5 years ago

Component set to libsqlfs

#3 Updated by abeluck almost 5 years ago

FYI https://www.sqlite.org/wal.html is required reading for understanding this ticket ;-)

#4 Updated by hans almost 5 years ago

An update from sjlombardo:

been looking into the WAL growth a bit. It's a really tricky situation and there may not be an easy answer. I should say the WAL reset can't complete… as the WAL grows, the reads slow down, and it becomes worse. I've tried a few options of interleaving checkpoints, but over time they eventually report that the database is busy, regardless of how you interleave them. It's difficult to say: if there are sufficient cases where the checkpointing can occur, then it shouldn't cause problems. However, under continuous heavy load all bets are off

another possibility would be to do some read/write locking in the library. I've got a small POC; I could push it on a branch of my fork:

https://github.com/sjlombardo/libsqlfs/tree/rwlock branch called rwlock

#5 Updated by hans over 4 years ago

Target version changed from 0.1 to 61

#6 Updated by hans over 4 years ago

sjlombardo posted this possible solution to this issue; it needs to be reviewed:
https://github.com/sjlombardo/libsqlfs/commit/9906642f89187288738d0be69c99e35063de0172

#7 Updated by hans almost 4 years ago

Target version deleted (61)