
Building a remote SQLite explorer

One of the main pain points of using SQLite in production deployments or VMs is managing the database. There are lots of database GUIs, but most of them only work with local SQLite databases. Managing an SQLite database remotely usually requires:

  • Adding a new service to the deployment (like Adminer, sqlite-web or postlite)
  • Giving the new service permissions to access the volume with the database
  • Exposing a port to access the service

The alternative is usually SSH’ing into the remote VM and using the sqlite3 CLI to manage or explore the database.

With this in mind, I decided to build a remote SQLite management GUI that does not require running any service in the remote VM and only needs an SSH connection between you and the remote machine.

Turning a database into an API

To access the remote database, we need some communication format between the app’s code and the database. Luckily for us, the sqlite3 CLI has a -json flag that turns the output into JSON. We can use that to send CLI commands over the SSH connection and read the JSON output as the response. One thing to note is that some pre-installed sqlite3 CLIs are too old to support the -json flag. You may need to install/scp/compile a newer sqlite3 binary on the remote to do this.

ssh $SSH_USER@$SSH_HOST "sqlite3 -json chinook.sqlite3 'select * from Artist limit 10'"

The output looks like:

[
  { "ArtistId": 1, "Name": "AC/DC" },
  { "ArtistId": 2, "Name": "Accept" },
  { "ArtistId": 3, "Name": "Aerosmith" },
  { "ArtistId": 4, "Name": "Alanis Morissette" },
  { "ArtistId": 5, "Name": "Alice In Chains" },
  { "ArtistId": 6, "Name": "Antônio Carlos Jobim" },
  { "ArtistId": 7, "Name": "Apocalyptica" },
  { "ArtistId": 8, "Name": "Audioslave" },
  { "ArtistId": 9, "Name": "BackBeat" },
  { "ArtistId": 10, "Name": "Billy Cobham" }
]

(Note: I’m using the chinook database as an example)
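Wrapped in a small helper, this is already most of a query API. Here’s a minimal sketch (the run_query helper and error handling are mine, not part of litexplore):

import json
import subprocess

def run_query(host: str, db: str, sql: str) -> list[dict]:
    # Naive single-quoting is fine for simple queries; see the
    # "Complex queries" section below for a safer way to pass SQL.
    p = subprocess.run(
        ["ssh", host, f"sqlite3 -json {db} '{sql}'"],
        capture_output=True,
        text=True,
        check=True,
    )
    # -json prints nothing (not "[]") when a query returns no rows
    return json.loads(p.stdout) if p.stdout.strip() else []

rows = run_query("my_host", "chinook.sqlite3", "select * from Artist limit 10")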

Multiplexing connections

The problem with sending commands over SSH is that it requires creating a new SSH connection/session for each command. To avoid this, we can use connection multiplexing: we tell SSH to keep the connection open as a socket and reuse it if another connection is made before some time threshold. With this, we remove the overhead of creating a new TCP connection and negotiating the secure channel for each query.

ssh -o ControlPersist=1m -o ControlMaster=auto -o ControlPath=~/.ssh/persistent-socket $SSH_USER@$SSH_HOST "sqlite3 -json chinook.sqlite3 'select * from Artist limit 10'"

ControlPersist=1m will make the connection persist for 1 minute after the last command/connection. You can learn more about SSH multiplexing in the OpenSSH ssh_config documentation.
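The same options can also live in your ~/.ssh/config, so every ssh invocation to that host reuses the socket automatically (the Host alias below is an example; the %r@%h:%p tokens give each user/host/port combination its own socket file):

Host my_host
    ControlMaster auto
    ControlPath ~/.ssh/control-%r@%h:%p
    ControlPersist 1m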

I ran some tests from my laptop in Spain, querying a database (different from chinook) on a server in Germany.

A simple query (select * from sqlite_master) reported the following timings:

real    0m0.147s
user    0m0.032s
sys     0m0.018s

For a query that selects more data, like select * from user limit 100, the timings are:

real    0m0.160s
user    0m0.020s
sys     0m0.011s

I think this is fast enough to “build an API” on top of sending CLI commands over a multiplexed SSH connection.

Extra benchmarks

I also queried a chinook database running on a VM in the US to add more geographic latency. When querying 10 “Artists”:

ssh -o ControlPersist=1m -o ControlMaster=auto -o ControlPath=~/.ssh/persistent-socket $SSH_USER@$SSH_HOST "sqlite3 -json chinook.sqlite3 'select * from Artist limit 10'"

The timings are:

real    0m0.317s
user    0m0.011s
sys     0m0.016s

For 200 “Artists”:

real    0m0.458s
user    0m0.010s
sys     0m0.015s

Complex queries

One problem with this approach is sending complex SQL queries: we don’t want to send huge commands with tricky quote escaping. Instead, we can pass the query as input to the command. From the command line, we can store the query in a text file:

ssh my_host "sqlite3 -json chinook.sqlite3" < query.txt

Or as a Python string using subprocess.run:

import subprocess

query = """
SELECT * FROM users LIMIT 10;
"""

# text=True is needed to pass the query as a str instead of bytes;
# p.stdout will hold the JSON output
p = subprocess.run(
    ["ssh", "my_host", "sqlite3 -json chinook.sqlite3"],
    input=query,
    capture_output=True,
    text=True,
)

litexplore

This is how litexplore was born. It’s a (very basic for now) web UI on top of the techniques mentioned above.

When you start the server, you’re prompted with a form asking for the SSH remote (as defined in your ~/.ssh/config), the path of the SQLite database on the remote and (optionally) a path to the sqlite3 CLI.

(Screenshot: litexplore login form)

The app will make sure that the remote sqlite3 CLI has the -json option available (a sketch of such a check is shown below) and then show a basic UI to explore the database.
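A minimal version of that check could run a trivial query and verify that the output parses as the expected JSON (this sketch is mine; litexplore’s actual check may differ):

import json
import subprocess

def remote_supports_json(host: str, sqlite3_bin: str = "sqlite3") -> bool:
    # sqlite3 builds older than 3.33 reject -json as an unknown option
    p = subprocess.run(
        ["ssh", host, f"{sqlite3_bin} -json :memory: 'select 1 as ok'"],
        capture_output=True,
        text=True,
    )
    if p.returncode != 0:
        return False
    try:
        return json.loads(p.stdout) == [{"ok": 1}]
    except json.JSONDecodeError:
        return False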

(Screenshot: table list for the chinook database)

Table views will link to foreign keys in each entry.

(Screenshot: litexplore table view)

The app still has some limitations, but there are many features on the roadmap. You can follow the development and provide feedback on GitHub.

The main reason why I’ve started building litexplore is that you just need an SSH connection. No extra apps, containers or open ports are needed on the remote!

Query parameters, SQL injection and security

The main purpose of litexplore is navigating a remote database without having to use the terminal/CLI. By default, all commands open the remote database in read-only mode, so a potentially malicious query can’t modify the database.

sqlite3 -json "file:test.sqlite3?mode=ro"

The sqlite3 CLI can also bind parameters to queries, using a temporary table under the hood. We can use this feature through the .parameter (or its .param alias) dot-command. We can then send queries like this:

.param init
.param set :limit "50"
.param set :offset "0"
select _rowid_ as rowid, * from Track limit :limit offset :offset

This ensures that queries are properly parameterized instead of having to manually escape the parameters when constructing the query string.
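Putting the read-only URI and the .param commands together, fetching a page of rows looks roughly like this (a sketch with hypothetical names; note that identifiers can’t be bound as parameters, so the table name must come from a trusted source such as sqlite_master):

import json
import subprocess

def fetch_page(host: str, db: str, table: str, limit: int, offset: int) -> list[dict]:
    # limit/offset are coerced to ints and bound as named parameters;
    # `table` is interpolated and must already be validated
    script = "\n".join([
        ".param init",
        f'.param set :limit "{int(limit)}"',
        f'.param set :offset "{int(offset)}"',
        f"select _rowid_ as rowid, * from {table} limit :limit offset :offset;",
    ])
    p = subprocess.run(
        ["ssh", host, f'sqlite3 -json "file:{db}?mode=ro"'],
        input=script,
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(p.stdout) if p.stdout.strip() else []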

Closing words

The development of litexplore is just starting; feel free to give any feedback on the GitHub repo. I’m also considering turning it into an installable (Electron) app, or creating an SQLite VFS to query remote databases through SSH without having to run a web app.