Mendeley encrypts your data so you can't access it

Recently, the Mendeley bibliography-management and paper-archive tool started encrypting users’ databases of bibliographic data.

Each such database includes annotations and is the product of a user’s careful, hard manual work.

Previously, users were able to access their valuable data using standard tools. Now, that access has been removed.

I have used Mendeley for almost a decade, now, and I value my database and my data very highly.

Therefore, I looked into what exactly Mendeley has done to prevent me accessing my data, and figured out an awkward manual technique for rescuing enough of it to let me switch to a new bibliography management system.

Background

The reasons for the change are unclear, but it’s possible that this user-hostile encryption was put into the product in retaliation for a competing product, Zotero, adding support for importing records from a Mendeley database. Zotero discusses the problem here, stating that

“Mendeley 1.19 and later have begun encrypting the local database, making it unreadable by Zotero and other standard database tools.”

and the project has also tweeted that

“The latest version of Mendeley prevents you from getting all of your data out of the app.”

Certainly, the old technique for accessing the database no longer works:

$ sqlite3 tonygarnockjones@gmail.com@www.mendeley.com.sqlite
SQLite version 3.23.1 2018-04-10 17:39:29
Enter ".help" for usage hints.
sqlite> .tables
Error: file is not a database
sqlite>

The new encryption definitely prevents me from getting all of my data from the database I have so carefully constructed.

(Update: Mendeley appears to be claiming that they are required by the GDPR to encrypt local user files! This is a bizarre claim, both to me and to many, many other people.)

The encryption Mendeley is using

Mendeley is using the SQLite Encryption Extension (“SEE”) with a hidden key.

The SEE library is closed-source and very proprietary. Its API is documented, but its on-disk structures are not (publicly) documented, and the source code is not publicly available. Applications using SEE are required to make it impossible to access SEE functionality from outside the application.

For a while, I had hoped that they might have used the open-source sqlcipher library. Unfortunately, they did not. Even more unfortunately, while tooling exists for sqlcipher, and while SEE and sqlcipher are API-compatible, the resulting files are neither binary- nor tooling-compatible.

It is not clear where the key material is coming from—whether the key is per-database, or whether it is a hard-coded key that is part of the Mendeley application binary, or whether it is a key that is retrieved from the Mendeley online API.

The question is ultimately moot, given the closedness of the SEE format and library. Even if I had the key, there is no tooling for making use of it in any other context than the Mendeley application itself.

How to rescue your data

Here’s the technique I used to rescue my data for long enough to allow a Zotero import to work.

Unfortunately, it’s not easy. It works only on Linux (but see below if you are using a Mac), and relies on being able to run gdb to call an internal SQLite SEE routine, sqlite3_rekey_v2.

In the instructions below, modify file paths etc. to line up with your own Mendeley and Unix usernames etc.

  1. Quit Mendeley. You don’t want it running while you’re fiddling with its database.

  2. BACK UP YOUR DATABASE. You will want to put things back the way they were after you’re done so you can use Mendeley again. THE REST OF THIS PROCEDURE MODIFIES THE DATABASE FILE ON DISK in ways that I do not know whether the Mendeley application can handle.

    cd ~/.local/share/data/Mendeley\ Ltd./Mendeley\ Desktop/
    cp tonygarnockjones@gmail.com@www.mendeley.com.sqlite ~/backup-encrypted.sqlite
    
  3. Start Mendeley under the control of gdb.

    mendeleydesktop --debug
    
  4. Add a breakpoint that captures the moment a SQLite database is opened.

    (gdb) b sqlite3_open_v2
    
  5. Start the program.

    (gdb) run
    
  6. The program will stop at the breakpoint several times. Keep continuing the program until the string pointed to by $rdi names the file you backed up in the step above.

    Thread 1 "mendeleydesktop" hit Breakpoint 1, 0x000000000101b1b0 in sqlite3_open_v2 ()
    (gdb) x/s $rdi
    0x1dca928:	"/home/tonyg/.local/share/data/Mendeley Ltd./Mendeley Desktop/Settings.sqlite"
    (gdb) c
    Continuing.
    
    Thread 1 "mendeleydesktop" hit Breakpoint 1, 0x000000000101b1b0 in sqlite3_open_v2 ()
    (gdb) x/s $rdi
    0x1dcb318:	"/home/tonyg/.local/share/data/Mendeley Ltd./Mendeley Desktop/Settings.sqlite"
    (gdb) c
    

    (… repeats a few times …)

    Thread 1 "mendeleydesktop" hit Breakpoint 1, 0x000000000101b1b0 in sqlite3_open_v2 ()
    (gdb) x/s $rdi
    0x25f1818:	"/home/tonyg/.local/share/data/Mendeley Ltd./Mendeley Desktop/tonygarnockjones@gmail.com@www.mendeley.com.sqlite"
    
  7. Now, set a breakpoint for the moment the key is supplied to SEE. We don’t care about the key itself (for reasons discussed above), but we do care to find the moment just after sqlite3_key has returned.

    (gdb) b sqlite3_key
    Breakpoint 2 at 0x101b2c0
    (gdb) c
    Continuing.
    
    Thread 1 "mendeleydesktop" hit Breakpoint 2, 0x000000000101b2c0 in sqlite3_key ()
    (gdb) info registers 
    rax            0x7fffffffc6b0	140737488340656
    rbx            0x25f0590	39781776
    rcx            0x7fffea9a0c40	140737129352256
    rdx            0x20	32
    rsi            0x260fd68	39910760
    rdi            0x25ef4e8	39777512
    rbp            0x7fffffffc730	0x7fffffffc730
    rsp            0x7fffffffc688	0x7fffffffc688
    r8             0xc1	193
    r9             0x7fffea9a0cc0	140737129352384
    r10            0x0	0
    r11            0x1	1
    r12            0x7fffffffc6b0	140737488340656
    r13            0x7fffffffc6a0	140737488340640
    r14            0x7fffffffc790	140737488340880
    r15            0x7fffffffc790	140737488340880
    rip            0x101b2c0	0x101b2c0 <sqlite3_key>
    eflags         0x202	[ IF ]
    cs             0x33	51
    ss             0x2b	43
    ds             0x0	0
    es             0x0	0
    fs             0x0	0
    gs             0x0	0
    
  8. Copy down the value of $rdi from the info registers output. It is the pointer to the open SQLite database handle. Then, finish execution of sqlite3_key.

    (gdb) fin
    Run till exit from #0  0x000000000101b2c0 in sqlite3_key ()
    0x0000000000f94e54 in SqliteDatabase::openInternal(QString const&, SqlDatabaseKey*) ()
    
  9. Use gdb’s ability to call C functions to rekey the database to the null key, thereby decrypting it in-place and allowing Zotero import to do its work.

    Use the value for $rdi you noted down in the previous step as the first argument to sqlite3_rekey_v2, and zero for the other three arguments.

    (gdb) p (int) sqlite3_rekey_v2(0x25ef4e8, 0, 0, 0)
    $1 = 0
    
  10. If you see $1 = 0 from the rekey command, all is well, and you may now use Zotero to import your Mendeley database. (Update: See below if you don’t see $1 = 0; for example, you might see $1 = 8.)

    While this is happening, leave gdb running and don’t touch it! DO NOT QUIT GDB OR RUN MENDELEY while the import is proceeding. Who knows what might happen to your carefully decrypted database if you do!

    In fact, before you start Zotero, you might like to copy your decrypted database to somewhere safe, so you don’t have to do this again:

    cd ~/.local/share/data/Mendeley\ Ltd./Mendeley\ Desktop/
    cp tonygarnockjones@gmail.com@www.mendeley.com.sqlite ~/backup-decrypted.sqlite
    
  11. Once the import is complete, quit gdb and terminate the associated partially-initialised Mendeley process.

    (gdb) quit
    A debugging session is active.
    
        Inferior 1 [process 32750] will be killed.
    
    Quit anyway? (y or n) y
    [322:322:0100/000000.491674:ERROR:broker_posix.cc(43)] Invalid node channel message
    
  12. Finally, restore your backed-up copy of the encrypted database, so that Mendeley will continue to run OK.

    cd ~/.local/share/data/Mendeley\ Ltd./Mendeley\ Desktop/
    cp ~/backup-encrypted.sqlite tonygarnockjones@gmail.com@www.mendeley.com.sqlite
    

You can now, if you like, in addition to using your new Zotero database as normal, access the raw contents of your decrypted Mendeley database using the standard SQLite tools, like you could in previous Mendeley versions:

$ sqlite3 ~/backup-decrypted.sqlite 
SQLite version 3.23.1 2018-04-10 17:39:29
Enter ".help" for usage hints.
sqlite> .tables
CanonicalDocuments       DocumentZotero           NotDuplicates          
DataCleaner              Documents                Profiles               
DocumentCanonicalIds     EventAttributes          RemoteDocumentNotes    
DocumentContributors     EventLog                 RemoteDocuments        
DocumentDetailsBase      FileHighlightRects       RemoteFileHighlights   
DocumentFields           FileHighlights           RemoteFileNotes        
DocumentFiles            FileNotes                RemoteFolders          
DocumentFolders          FileReferenceCountsView  Resources              
DocumentFoldersBase      FileViewStates           RunsSinceLastCleanup   
DocumentKeywords         Files                    SchemaVersion          
DocumentNotes            Folders                  Settings               
DocumentReferences       Groups                   Stats                  
DocumentTags             HtmlLocalStorage         SyncTokens             
DocumentUrls             ImportHistory            ZoteroLastSync         
DocumentVersion          LastReadStates         
sqlite> 

Conclusion

No user should have to do this to access their data. I’m lucky I have the skills to do it at all.

I’m sad that Mendeley violated my trust this way, but glad I have an exit strategy now.

Update: what to do if you see $1 = 8

If your p sqlite3_rekey_v2(...) attempt fails, with (say) $1 = 8 as the outcome, then you may have been victim of an unfortunate thread interleaving, or you might have caught a “spurious” opening of the database. It seems that the program sometimes opens the main database at least once in some odd way, before opening it properly for long-term use.

If you think it’s threading, you could try abandoning the procedure and restarting from the beginning: just quit gdb and restart the procedure from mendeleydesktop --debug.

To deal with the “spurious” openings, experiment to see if the program opens the main database a second time. Run the procedure all the way up to the p sqlite3_rekey_v2(...) step, but do not run sqlite3_rekey_v2. Instead, just type c to continue, returning to the step where you inspect each call to sqlite3_open_v2, waiting for one with $rdi pointing to a string with the right database filename. When you see it come round again, then try the sqlite3_rekey_v2 step. If you see $1 = 0 this time, you’re all set, and can proceed as described above for a successful call to sqlite3_rekey_v2.

If you’re still having problems with this procedure, do feel free to email me, and I’ll try to help if I can.

Update: Using a Mac

Steve Laskaridis reports success using this procedure on a Mac. He says that the necessary changes to the procedure above are:

  1. Use lldb instead of gdb;
  2. Use the register read command instead of info registers; and
  3. The database file directory is ~/Library/Application Support/Mendeley Desktop/.

He also published this tweet, which includes a screenshot of the procedure running on a Mac.

Update: Flatpak-based Mendeley distributions

Greydon Gilmore has published an article, based on this one, including instructions for decrypting the database when using a Flatpak-based Mendeley installation.