Recently, the Mendeley
bibliography-management and paper-archive tool started encrypting
users’ databases of bibliographic data.
Each such database includes annotations and is the product of a user’s
careful, hard manual work.
Previously, users were able to access their valuable data using
standard tools. Now, that access has been removed.
I have used Mendeley for almost a decade, now, and I value my database
and my data very highly.
Therefore, I looked into what exactly Mendeley has done to prevent me
accessing my data, and figured out an awkward manual technique for
rescuing enough of it to let me switch to a new bibliography
management system.
Background
The reasons for the change are unclear, but it’s possible that this
user-hostile encryption was put into the product in retaliation for a
competing product, Zotero, adding support for importing records from a
Mendeley database. Zotero discusses the problem, stating that
“Mendeley 1.19 and later have begun encrypting the local database,
making it unreadable by Zotero and other standard database tools.”
The project has also tweeted that
“The latest version of Mendeley prevents you from getting all of
your data out of the app.”
Certainly, the old technique for accessing the database no longer
works:
$ sqlite3 tonygarnockjones@gmail.com@www.mendeley.com.sqlite
SQLite version 3.23.1 2018-04-10 17:39:29
Enter ".help" for usage hints.
sqlite> .tables
Error: file is not a database
sqlite>
The new encryption definitely prevents me from getting all of my data
from the database I have so carefully constructed.
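A quick way to tell whether a given file is still readable by standard tools is to look at its first bytes: a plain SQLite 3 database begins with the string "SQLite format 3". The sketch below assumes that an encrypted file does not carry that header, which is consistent with the "file is not a database" error above but is not something I have confirmed from the SEE documentation. The function name `is_plain_sqlite` is made up for this illustration.

```shell
# Print "plain" if the file starts with the standard SQLite 3 header,
# and "not-plain" otherwise (encrypted, corrupt, missing, or not a database).
is_plain_sqlite() {
    # Read the first 15 bytes and drop any NUL bytes so the shell does not
    # complain about binary data in the command substitution.
    header=$(head -c 15 "$1" 2>/dev/null | tr -d '\000')
    if [ "$header" = "SQLite format 3" ]; then
        echo plain
    else
        echo not-plain
    fi
}
```

Running `is_plain_sqlite` against your Mendeley database before and after the procedure described later gives a cheap sanity check without opening the file in sqlite3.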
(Update: Mendeley appears to be claiming that they are required by the
GDPR to encrypt local user files! This is a bizarre claim, both to me
and to many, many other people.)
The encryption Mendeley is using
Mendeley is using the SQLite Encryption Extension (“SEE”) with a
hidden key.
The SEE library is closed-source and very proprietary. Its API is
documented, but its on-disk structures are not (publicly) documented,
and the source code is not publicly available. Applications using SEE
are required to make it impossible to access SEE functionality from
outside the application.
For a while, I had hoped that they might have used the open-source
sqlcipher library.
Unfortunately, they did not. Even more unfortunately, while tooling
exists for sqlcipher, and while SEE and sqlcipher are API-compatible,
the resulting files are neither binary- nor tooling-compatible.
It is not clear where the key material is coming from: whether the
key is per-database, whether it is a hard-coded key that is part of
the Mendeley application binary, or whether it is a key that is
retrieved from the Mendeley online API.
The question is ultimately moot, given the closedness of the SEE
format and library. Even if I had the key, there is no tooling for
making use of it in any other context than the Mendeley application
itself.
How to rescue your data
Here’s the technique I used to rescue my data for long enough to allow
a Zotero import to work.
Unfortunately, it’s not easy. It works only on Linux (but see below if
you are using a Mac), and relies on being able to run gdb to call an
internal SQLite SEE routine, sqlite3_rekey_v2.
In the instructions below, modify file paths etc. to line up with your
own Mendeley and Unix usernames etc.
- Quit Mendeley. You don’t want it running while you’re fiddling
with its database.
- BACK UP YOUR DATABASE. You will want to put things back the
way they were after you’re done so you can use Mendeley again.
THE REST OF THIS PROCEDURE MODIFIES THE DATABASE FILE ON DISK
in ways that I am not sure the Mendeley application can handle.
cd ~/.local/share/data/Mendeley\ Ltd./Mendeley\ Desktop/
cp tonygarnockjones@gmail.com@www.mendeley.com.sqlite ~/backup-encrypted.sqlite
- Start Mendeley under the control of gdb.
$ mendeleydesktop --debug
- Add a breakpoint that captures the moment a SQLite database is opened.
(gdb) b sqlite3_open_v2
- Start the program.
(gdb) run
- The program will stop at the breakpoint several times. Keep
continuing the program until the string pointed to by $rdi names
the file you backed up in the step above.
Thread 1 "mendeleydesktop" hit Breakpoint 1, 0x000000000101b1b0 in sqlite3_open_v2 ()
(gdb) x/s $rdi
0x1dca928: "/home/tonyg/.local/share/data/Mendeley Ltd./Mendeley Desktop/Settings.sqlite"
(gdb) c
Continuing.
Thread 1 "mendeleydesktop" hit Breakpoint 1, 0x000000000101b1b0 in sqlite3_open_v2 ()
(gdb) x/s $rdi
0x1dcb318: "/home/tonyg/.local/share/data/Mendeley Ltd./Mendeley Desktop/Settings.sqlite"
(gdb) c
(… repeats a few times …)
Thread 1 "mendeleydesktop" hit Breakpoint 1, 0x000000000101b1b0 in sqlite3_open_v2 ()
(gdb) x/s $rdi
0x25f1818: "/home/tonyg/.local/share/data/Mendeley Ltd./Mendeley Desktop/tonygarnockjones@gmail.com@www.mendeley.com.sqlite"
- Now, set a breakpoint for the moment the key is supplied to SEE.
We don’t care about the key itself (for reasons discussed above),
but we do care to find the moment just after sqlite3_key has
returned.
(gdb) b sqlite3_key
Breakpoint 2 at 0x101b2c0
(gdb) c
Continuing.
Thread 1 "mendeleydesktop" hit Breakpoint 2, 0x000000000101b2c0 in sqlite3_key ()
(gdb) info registers
rax 0x7fffffffc6b0 140737488340656
rbx 0x25f0590 39781776
rcx 0x7fffea9a0c40 140737129352256
rdx 0x20 32
rsi 0x260fd68 39910760
rdi 0x25ef4e8 39777512
rbp 0x7fffffffc730 0x7fffffffc730
rsp 0x7fffffffc688 0x7fffffffc688
r8 0xc1 193
r9 0x7fffea9a0cc0 140737129352384
r10 0x0 0
r11 0x1 1
r12 0x7fffffffc6b0 140737488340656
r13 0x7fffffffc6a0 140737488340640
r14 0x7fffffffc790 140737488340880
r15 0x7fffffffc790 140737488340880
rip 0x101b2c0 0x101b2c0 <sqlite3_key>
eflags 0x202 [ IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
- Copy down the value of $rdi from the info registers output. It
is the pointer to the open SQLite database handle. Then, finish
execution of sqlite3_key.
(gdb) fin
Run till exit from #0 0x000000000101b2c0 in sqlite3_key ()
0x0000000000f94e54 in SqliteDatabase::openInternal(QString const&, SqlDatabaseKey*) ()
- Use gdb’s ability to call C functions to rekey the database to
the null key, thereby decrypting it in-place and allowing Zotero
import to do its work.
Use the value for $rdi you noted down in the previous step as
the first argument to sqlite3_rekey_v2, and zero for the other
three arguments.
(gdb) p (int) sqlite3_rekey_v2(0x25ef4e8, 0, 0, 0)
$1 = 0
- If you see $1 = 0 from the rekey command, all is well, and you
may now use Zotero to import your Mendeley database. (Update:
See below if you don’t see $1 = 0; for example, you might see
$1 = 8.)
While this is happening, leave gdb running and don’t touch it!
DO NOT QUIT GDB OR RUN MENDELEY while the import is
proceeding. Who knows what might happen to your carefully
decrypted database if you do!
In fact, before you start Zotero, you might like to copy your
decrypted database to somewhere safe, so you don’t have to do this
again:
cd ~/.local/share/data/Mendeley\ Ltd./Mendeley\ Desktop/
cp tonygarnockjones@gmail.com@www.mendeley.com.sqlite ~/backup-decrypted.sqlite
- Once the import is complete, quit gdb and terminate the
associated partially-initialised Mendeley process.
(gdb) quit
A debugging session is active.
Inferior 1 [process 32750] will be killed.
Quit anyway? (y or n) y
[322:322:0100/000000.491674:ERROR:broker_posix.cc(43)] Invalid node channel message
- Finally, restore your backed-up copy of the encrypted database, so
that Mendeley will continue to run OK.
cd ~/.local/share/data/Mendeley\ Ltd./Mendeley\ Desktop/
cp ~/backup-encrypted.sqlite tonygarnockjones@gmail.com@www.mendeley.com.sqlite
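Since the whole procedure hinges on the two backup copies being good, it may be worth verifying each copy rather than trusting a bare cp. The following is a minimal sketch, not part of the procedure as I ran it: `safe_copy` is a hypothetical helper that copies a file, refuses to clobber an existing destination, and checks that the copy is byte-for-byte identical.

```shell
# Copy SRC to DEST and confirm the copy is byte-for-byte identical.
# Refuses to overwrite an existing DEST, so an earlier backup is never clobbered.
safe_copy() {
    src=$1
    dest=$2
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        return 1
    fi
    # cmp -s is silent and exits non-zero if the two files differ.
    cp -- "$src" "$dest" && cmp -s -- "$src" "$dest"
}

# Example use for the backup steps (paths as in the instructions above):
#   cd ~/.local/share/data/Mendeley\ Ltd./Mendeley\ Desktop/
#   safe_copy tonygarnockjones@gmail.com@www.mendeley.com.sqlite ~/backup-encrypted.sqlite
```

Because it refuses to overwrite, this helper suits the two backup steps only. The final restore step deliberately overwrites the live database file, so a plain cp is still the right tool there.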
You can now, if you like, in addition to using your new Zotero
database as normal, access the raw contents of your decrypted Mendeley
database using the standard SQLite tools, like you could in previous
Mendeley versions:
$ sqlite3 ~/backup-decrypted.sqlite
SQLite version 3.23.1 2018-04-10 17:39:29
Enter ".help" for usage hints.
sqlite> .tables
CanonicalDocuments DocumentZotero NotDuplicates
DataCleaner Documents Profiles
DocumentCanonicalIds EventAttributes RemoteDocumentNotes
DocumentContributors EventLog RemoteDocuments
DocumentDetailsBase FileHighlightRects RemoteFileHighlights
DocumentFields FileHighlights RemoteFileNotes
DocumentFiles FileNotes RemoteFolders
DocumentFolders FileReferenceCountsView Resources
DocumentFoldersBase FileViewStates RunsSinceLastCleanup
DocumentKeywords Files SchemaVersion
DocumentNotes Folders Settings
DocumentReferences Groups Stats
DocumentTags HtmlLocalStorage SyncTokens
DocumentUrls ImportHistory ZoteroLastSync
DocumentVersion LastReadStates
sqlite>
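Once the decrypted copy opens in sqlite3, you might also want a plain-text archive that does not depend on any binary format at all. This sketch wraps the sqlite3 shell's `.dump` command. The function name `dump_sqlite` and the output filename in the example are hypothetical, and it assumes the sqlite3 command-line tool is installed.

```shell
# Write a plain-text SQL dump of database DB to file OUT.
# Returns 127 if the sqlite3 command-line tool is not installed;
# otherwise returns whatever exit status sqlite3 itself reports.
dump_sqlite() {
    db=$1
    out=$2
    if ! command -v sqlite3 >/dev/null 2>&1; then
        echo "sqlite3 not found" >&2
        return 127
    fi
    sqlite3 "$db" .dump > "$out"
}

# Example:
#   dump_sqlite ~/backup-decrypted.sqlite ~/mendeley-dump.sql
```

The resulting file is ordinary SQL text, so it can be searched with grep or loaded into a fresh database later.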
Conclusion
No user should have to do this to access their data. I’m lucky I have
the skills to do it at all.
I’m sad that Mendeley violated my trust this way, but glad I have an
exit strategy now.
Update: what to do if you see $1 = 8
If your p sqlite3_rekey_v2(...) attempt fails, with (say) $1 = 8 as
the outcome, then you may have been victim of an unfortunate thread
interleaving, or you might have caught a “spurious” opening of the
database. It seems that the program sometimes opens the main database
at least once in some odd way, before opening it properly for
long-term use.
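It can help to know what the number means. Assuming sqlite3_rekey_v2 returns ordinary SQLite primary result codes (I have not checked this against the SEE documentation), the value can be looked up in the standard table, a subset of which is sketched below. The helper name `sqlite_rc_name` is made up for this illustration.

```shell
# Translate a numeric SQLite primary result code into its symbolic name.
# Only a subset of the standard codes is listed here.
sqlite_rc_name() {
    case "$1" in
        0)  echo SQLITE_OK ;;
        1)  echo SQLITE_ERROR ;;
        5)  echo SQLITE_BUSY ;;
        6)  echo SQLITE_LOCKED ;;
        7)  echo SQLITE_NOMEM ;;
        8)  echo SQLITE_READONLY ;;
        10) echo SQLITE_IOERR ;;
        11) echo SQLITE_CORRUPT ;;
        14) echo SQLITE_CANTOPEN ;;
        21) echo SQLITE_MISUSE ;;
        26) echo SQLITE_NOTADB ;;
        *)  echo "unknown ($1)" ;;
    esac
}
```

Under that assumption, 8 is SQLITE_READONLY, which would fit a handle that was opened without write access, such as one of the “spurious” openings. I can't confirm that this is the actual cause.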
If you think it’s threading, you could try abandoning the attempt:
quit gdb and restart the procedure from the beginning, at
mendeleydesktop --debug.
To deal with the “spurious” openings, experiment to see if the program
opens the main database a second time. Run the procedure all the way
up to the p sqlite3_rekey_v2(...) step, but do not run
sqlite3_rekey_v2. Instead, just type c to continue, returning to
the step where you inspect each call to sqlite3_open_v2, waiting for
one with $rdi pointing to a string with the right database filename.
When you see it come round again, then try the sqlite3_rekey_v2
step. If you see $1 = 0 this time, you’re all set, and can proceed
as described above for a successful call to sqlite3_rekey_v2.
If you’re still having problems with this procedure, do feel free to
email me, and I’ll try to help if I can.
Update: Using a Mac
Steve Laskaridis reports success using this procedure on a Mac. He
says that the necessary changes to the procedure above are:
- Use lldb instead of gdb;
- Use the register read command instead of info registers; and
- The database file directory is
~/Library/Application Support/Mendeley Desktop/.
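If you are scripting any of the file-copying steps, the directory difference can be captured in one place. This is a small sketch using the two directories mentioned in this article. The function name `mendeley_data_dir` is hypothetical, and other installation types may well use other locations.

```shell
# Print the Mendeley Desktop data directory for the given OS name
# (as reported by `uname -s`); defaults to the current system.
mendeley_data_dir() {
    os=${1:-$(uname -s)}
    case "$os" in
        Linux)  echo "$HOME/.local/share/data/Mendeley Ltd./Mendeley Desktop" ;;
        Darwin) echo "$HOME/Library/Application Support/Mendeley Desktop" ;;
        *)      echo "unsupported OS: $os" >&2; return 1 ;;
    esac
}

# Example:
#   cd "$(mendeley_data_dir)"
```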
He also published a tweet which includes a screenshot of the procedure
running on a Mac.
Update: Flatpak-based Mendeley distributions
Greydon Gilmore has published an article, based on this one, including
instructions for decrypting the database when using a Flatpak-based
Mendeley installation.