Journaling - by which I mean keeping a personal diary - didn’t come naturally to me. I rarely miss a day now and have found it useful both in the moment and as a resource to return to. The key has been to reduce friction and to go all-in on voice.
How It Started
In late 2020, during “the pandemic”, I created a new Google Doc and typed:
2020-12-06 9:30pm
First session, after the initial measurement a week ago. I think mostly rested except maybe close to the hip; definitely tired so not in best place for training. 6kg/side for Tailor, 6kg for the rest. Horse Stance hurts at the *top*? Had about an 80cm heel-to-heel width, getting thighs close-ish to parallel to the ground. [...]
A few times per week, I would add similar entries, with varying degrees of sparsity, on my workouts. Eventually this stopped, and I picked it up again a few months later, in February 2023, with an entry stating,
Haven’t been using this for a long time, going to use it as a personal log for everything now. I’ve been doing that for work for well over a year now and works great.
That’s right, I got into journaling through work, where it’s evidently useful (took me long enough). As promised in that entry the habit did stick in private life too, so I can forever date treasured memories such as this one:
Ugh sliced my left index finger w bread knife
or the below, variations of which are sadly a commonplace occurrence,
Just really crap eating basically, finished the day with chocolate followed by Magnum followed by chocolate again
But I remember that it was difficult, for a long time, to maintain the habit. It felt like a chore, one in which I was relying on willpower as a strategy. The need to type these entries out - often in bed on the phone after the kids were in bed and I was more than ready for the same - discouraged verbosity and effectively rendered many entries impersonal and terse.
That was fine at work, but if one of my kids did something new and cute one day or I’d had a big fight, the diary would neither serve as a prompt to reflect on this nor would it leave much of an account of what actually transpired. My personal journal was useful, but left a lot to be desired.
Things got noticeably better when voice typing on Android improved (and I realized that disabling multi-language would significantly improve transcription accuracy and latency). I could dictate my diary while reading along on the screen and correcting any errors. This made entries significantly more verbose and their content more conversational, allowing me to more easily reflect on the happenings of the day, at least on some days.
But it still wasn’t good enough - too much friction, and lots of transcription errors slipping through, requiring eyes on the screen and yet resulting in entries that are tricky to decipher even today. When I took an extended leave from work this past year, I decided to revisit this once again and looked into third-party transcription services.
Not wanting to host any infrastructure myself, I found audionotes.ai, whose free tier was enough to experiment with. I’d transcribe on the app, and would copy over into the doc. The output was a lot better - speaking clearly is required but transcription errors occur mostly for names. Also, transcription not occurring in real time surprisingly ended up being better: I would pay less attention to the phone while journaling.
At some point, they added a WhatsApp bot, which is such a small change from opening the app, but it significantly reduced (perceived) friction for me.
This brings us to…
How It’s Going
Every night, or rarely multiple times during a day, I’ll excuse myself for a few minutes and find a relatively quiet, private area where I record a WhatsApp voice message to the audionotes bot. Since a recent update, voice messages can be paused while recording, so I can record them while walking outside, which I often do despite occasional loud cars or passersby. I have an audionotes lifetime membership, to avoid limitations on payload sizes.
I usually speak for five to ten minutes, during which I try to reflect on the day on top of chronicling it. The transcription usually comes back within a minute of submitting the message, and I’ll paste it at the top of my Google Doc, under a dated heading. Then, I clear the chat (I am very private about my journal as I don’t want to unintentionally hold back). These days, I often find the time to correct transcription errors, but if I don’t, it’s not a big deal as they are mostly benign. Occasionally, I’ll still type shorter entries on the phone or the laptop, but the above has become the default.
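For anyone who keeps their journal as plain text rather than a Google Doc, the "paste at the top under a dated heading" step could look roughly like the sketch below. It assumes the heading format from the very first entry (`2020-12-06 9:30pm`); the function names are made up for illustration and nothing here talks to Google Docs or audionotes.

```python
from datetime import datetime


def format_heading(when: datetime) -> str:
    """Format a heading like '2020-12-06 9:30pm' (assumed convention)."""
    hour12 = when.hour % 12 or 12          # 0 -> 12, 13 -> 1, ...
    suffix = "am" if when.hour < 12 else "pm"
    return f"{when:%Y-%m-%d} {hour12}:{when:%M}{suffix}"


def prepend_entry(journal: str, transcript: str, when: datetime) -> str:
    """Return the journal text with a new dated entry placed at the top."""
    entry = f"{format_heading(when)}\n{transcript.strip()}\n"
    # Keep a blank line between the new entry and the older ones.
    return entry + ("\n" + journal if journal else "")


if __name__ == "__main__":
    old = "2023-02-09 8:00pm\nOlder entry.\n"
    print(prepend_entry(old, "  Today's transcript.  ", datetime(2023, 2, 10, 21, 30)))
```

The newest entry always ends up first, which matches the habit of pasting at the top of the doc.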
And that’s it! Journaling has turned from a chore that I did because I knew it was worth it to a part of my evening wind-down, and the fidelity has improved significantly in the process.
As for using the diary, so far I’ve mostly used it to look stuff up. For example, if on a recent phone call you told me that your significant other was mad at their boss the other day and might quit, I probably jotted it down here, because this stuff matters once we speak again and I am otherwise poor at remembering it.
The Future
What I want, ultimately, is a diary that (gently) interrogates me - further aiding me in reflection, while removing more friction. As a straw man, a custom GPT whose context includes my journal to date, and which I use in conversation mode (i.e. no typing). The GPT would ask about my day and perhaps inquire about ongoing events from past entries. We would spend, usually, 5-10 minutes going back and forth at conversational pace, and the final output of the GPT would be my journal entry, written in a style similar enough to mine. This would also effectively remove transcription errors (the GPT would know when I refer to my kids and how to spell their names). I tried to set this up once and realized it wasn’t going to happen yet, but one day this will exist - after all, AI therapists are a thing already and this is similar.
I would also like to interrogate my journal in natural language, perhaps in the same chat. So far, CTRL+F and manual prompts in the doc have worked well enough, but I am noticing that this doesn’t actually search the entire document any more.
As an experiment, I just pasted my entire diary into Claude 3 Opus and it was able to answer questions with what looks like high fidelity, so maybe this is already “solved” for occasional queries assuming a better integration is provided.
The general “AI second brain” space is crowded (rewind.ai, Augment, projects like quivr, including a few specific to journaling) but there are enough gotchas for each of these that I haven’t gotten close to switching.
In the home-grown department, I’ve tried this tutorial which uses RAG over a local vector DB and ChatGPT4, but I found the RAG approach didn’t work too well. Gemini, with its built-in Drive integration, was comically bad. So my strategy for now is to just keep journaling and to mostly rely on CTRL+F but to occasionally copy-paste into Claude.
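As a fallback for when CTRL+F in the doc stops covering the whole document, a plain-text export can be searched entry by entry. The sketch below is only an illustration: it assumes each entry starts with a line beginning with a `YYYY-MM-DD` date (as in the first entry above), and the function names are hypothetical. It does plain substring matching, nothing smarter.

```python
import re

# A heading is any line that starts with a date like 2020-12-06 (assumed format).
HEADING = re.compile(r"^\d{4}-\d{2}-\d{2}\b.*$", re.MULTILINE)


def split_entries(journal: str) -> list[tuple[str, str]]:
    """Split exported journal text into (heading, body) pairs."""
    matches = list(HEADING.finditer(journal))
    entries = []
    for i, m in enumerate(matches):
        # The body runs until the next heading, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(journal)
        entries.append((m.group(0).strip(), journal[m.end():end].strip()))
    return entries


def search(journal: str, term: str) -> list[tuple[str, str]]:
    """Return every entry whose body contains `term`, ignoring case."""
    needle = term.casefold()
    return [(h, b) for h, b in split_entries(journal) if needle in b.casefold()]


if __name__ == "__main__":
    text = (
        "2023-02-10 9:00pm\nSliced my finger with the bread knife\n\n"
        "2023-02-09 8:00pm\nChocolate followed by Magnum\n"
    )
    for heading, body in search(text, "magnum"):
        print(heading, "|", body)
```

One caveat: a body line that happens to start with a date would be mistaken for a new heading, so this only works if dates at the start of a line are reserved for headings.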
I assume “diary-like” functionality will be a part of future generations of mobile phone assistants (assuming “pendants” don’t take that market, which I assume they won’t). I’ve noticed that the current-gen Pixel phones already transcribe conversations (even with multiple speakers), and do it very well.
Pet Peeves
… with my current setup.
The bot is down sometimes and it’s annoying to have to deal with it. Once I even went through the trouble of setting up API access to OpenAI (Whisper) directly on my laptop and submitted the recording directly just to get it off my plate.
Audionotes lets you (via a setting) “boost” words that the model tends to mistranscribe (for example, my kids’ names), but it doesn’t work too well, so I end up making similar corrections most nights. The model also frequently messes up the ends of sentences - pause for too long and it’ll insert a period where there shouldn’t be one. In general, it transcribes well enough, but not nearly as well as a person would.
It doesn’t merge multiple voice messages into one, so I stick to the pattern of leaving a single large voice message. If it merged messages received “at the same time”, I could record smaller messages throughout the day (recording just when something happened vs. later) and have a simpler copy-paste into the doc. Of course the copy-paste step could be automated, somehow.
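To make the "merge messages received at the same time" idea concrete, here is a minimal sketch of one way it could work: transcripts that arrive within a few minutes of the previous one are joined into a single entry. This is not how audionotes behaves or exposes its data; the `Transcript` type, the `merge_nearby` function and the five-minute window are all assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Transcript:
    received: datetime  # when the transcription came back (hypothetical field)
    text: str


def merge_nearby(
    items: list[Transcript], window: timedelta = timedelta(minutes=5)
) -> list[Transcript]:
    """Merge transcripts that arrive within `window` of the previous one."""
    merged: list[Transcript] = []
    last_seen = None
    for item in sorted(items, key=lambda t: t.received):
        if merged and item.received - last_seen <= window:
            # Close enough to the previous message: append to the same entry.
            prev = merged[-1]
            merged[-1] = Transcript(prev.received, prev.text + "\n\n" + item.text)
        else:
            merged.append(Transcript(item.received, item.text))
        last_seen = item.received
    return merged


if __name__ == "__main__":
    day = datetime(2024, 1, 1)
    notes = [
        Transcript(day.replace(hour=21, minute=3), "Second part."),
        Transcript(day.replace(hour=21, minute=0), "First part."),
        Transcript(day.replace(hour=22, minute=0), "Separate thought."),
    ]
    for t in merge_nearby(notes):
        print(t.received, repr(t.text))
```

The window is measured from the most recent message rather than the first one in a group, so a run of short messages in quick succession stays together. To collect notes from throughout the day into one entry instead, grouping by calendar date would be the simpler rule.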
Audionotes also generates summaries of the transcriptions. These are not bad and would definitely work in a work context or when a friend sends you a 12-minute voice message that you just need to get the gist of. These summaries sound nothing like me, so I don’t use them for my journaling.
Last but not least, journaling is a skill and I’m at most an intermediate. What I record may not always be what really matters. You get better at what you do though, and I’m doing it.
If you found this post useful or think there’s something I should try, I would be happy to hear about it.