
The End of Apps — Kitze, Sizzy.co

TL;DR

  • The speaker's lifelong quest for an ideal productivity system evolved from simple to-do lists to complex, custom-built "life OS" applications like Benji, aiming to offload thoughts and manage all aspects of life.
  • Initial excitement about AI agents and GPT plugins for personal productivity was dampened by challenges such as system unreliability, context loss, UI friction, and the overwhelming complexity of orchestrating multiple specialized agents.
  • The future of productivity is envisioned as an inversion of the current AI prompting paradigm, where an AI "life OS" proactively prompts users for decisions, delegating 99% of tasks, with local, privacy-focused agents potentially becoming the dominant solution for everyday users.

Takeaways

  • The speaker built various personal "life OS" tools, including "to do do" (priority-based to-dos), "better" (to-dos, habits, planner), and "Benji" (an all-in-one app), due to dissatisfaction with existing productivity apps.
  • Early personal productivity systems involved using text files with an Android app called Tasker for contextual reminders and IFTTT integration with Google Assistant for offloading thoughts.
  • Significant friction exists in data input for productivity tools, leading to cycles of intense usage followed by complete abandonment.
  • The speaker experimented with Claude Code (Anthropic's coding agent) for tool calls and personal skills, eventually adopting Clawdbot as a conversational interface for their personal assistant.
  • Self-hosting personal data on devices like NAS and using local markdown is a key strategy for agents to work effectively, emphasizing data ownership and privacy.
  • Specialized agents, each with a defined purpose, model, system prompt, tool list, and permissions, are preferred over a single, overloaded agent for managing different life domains.
  • Current conversational interfaces like Discord and Telegram are not ideal UIs for a comprehensive life OS due to limitations in managing complex tasks, context, and multi-agent interactions.
  • The speaker's "Wolfer" experiment addresses agent reliability and memory by using nested topics to inject context from parent topics into conversations, rather than relying on a separate memory system.
  • A future where AI agents proactively prompt users for decisions, delegating most tasks and generating UI on the fly, is predicted to replace most consumer apps for "normies."
  • Local agents on platforms like Apple and Google Pixel are anticipated to become powerful and private "life OS" solutions, leveraging device capabilities without cloud dependencies or credit costs.
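
The specialized-agent takeaway above (one agent per life domain, each with a purpose, model, system prompt, tool list, and permissions) can be sketched as a small data structure. This is only an illustration under stated assumptions, not how OpenClaw or any other tool actually represents agents: the class, the model identifiers, the tool names, and the permission strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    """Hypothetical description of one specialized agent."""
    name: str
    purpose: str
    model: str                      # placeholder identifier, not a real model name
    system_prompt: str
    tools: frozenset = frozenset()
    permissions: frozenset = frozenset()

    def may_call(self, tool: str, needs: str) -> bool:
        """Allow a call only if the tool is on this agent's list AND the
        agent holds the permission that the tool needs."""
        return tool in self.tools and needs in self.permissions


# One narrowly scoped agent per domain instead of one overloaded agent.
AGENTS = {
    "fitness": AgentSpec(
        name="coach",
        purpose="fitness",
        model="local-small",
        system_prompt="You only discuss training and nutrition.",
        tools=frozenset({"log_workout"}),
        permissions=frozenset({"write:health"}),
    ),
    "work": AgentSpec(
        name="support",
        purpose="customer support",
        model="hosted-large",
        system_prompt="You draft replies to customer emails.",
        tools=frozenset({"read_email", "draft_email"}),
        permissions=frozenset({"read:email"}),
    ),
}


def agent_for(domain: str) -> AgentSpec:
    """Route a conversation to the agent that owns the given domain."""
    return AGENTS[domain]


if __name__ == "__main__":
    coach = agent_for("fitness")
    print(coach.may_call("log_workout", "write:health"))
    print(coach.may_call("draft_email", "read:email"))
```

Keeping the tool list and the permission set separate means removing a capability from one agent never affects the others.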

Vocabulary

Tasker: An Android application for creating custom automated tasks based on specific conditions or triggers.
IFTTT: "If This, Then That," a web service that allows users to create chains of conditional statements to automate tasks across various apps and devices.
Life OS: A holistic personal operating system designed to integrate and manage all aspects of one's life, including tasks, habits, schedules, and information.
SaaS: Software as a Service; a software licensing and delivery model where software is licensed on a subscription basis and is centrally hosted.
JSON: JavaScript Object Notation; a lightweight data-interchange format that is easy for humans to read and write and for machines to parse and generate.
API calls: Requests made by a client application to a server's Application Programming Interface (API) to retrieve data or perform an action.
Tool calls: A feature in large language models that enables them to invoke external functions or software tools to execute specific tasks or access real-world information.
Self-hosting: The practice of running web servers, applications, or other services on one's own privately-owned hardware, rather than relying on third-party cloud providers.
NAS: Network Attached Storage; a dedicated data storage server that allows multiple users and client devices to access and share data over a network.
Cron job: A command or script scheduled by cron, the time-based job scheduler in Unix-like operating systems, to run automatically at specified times or intervals.
LLM psychosis: A colloquial term describing the phenomenon where Large Language Models exhibit unreliable, forgetful, or repetitive behavior, especially in complex or extended interactions.

Transcript

Back to you. Those are not my slides. There you go. Hi, I'm Kitze. We probably argued on X. I'm thekitze on X, and I turned 34 years old today. Decided to do a talk on my birthday because of the party. Thank you. I'd like to torture myself by asking: do we have anyone from Tinkerer Club here? Please? Just a person sleeping in the back. He's like, oh, what did he ask? All right. I formed this recently. It's a community where every person inside is a copy-paste of everyone else inside, which is hilarious to see. If you want, you can join us.

So I'm going to talk today about the past, present, and future of productivity and personal agents, starting with my first to-do app, which was when I was 10 years old, which is crazy. I found an old note in a notebook and some scribbles that are barely legible, like, I need to eat my string juice today. I don't know what a 10-year-old does for a to-do list, but it clearly had checkboxes, and I've been trying and wrestling to solve productivity since then. Was anyone else forever unhappy with to-do apps? Please. There's no perfect... thank you. Thank you. It's not only me.

So I tried... this was probably 15 years ago. I got so fed up with Todoist and the other ones that I started using text files, way before all of this local Markdown blah, blah, blah. And I used an Android app called Tasker to basically manage all of these text files. I got contextual reminders, like whenever I connect to Wi-Fi, remind me about something, or when I arrive at a destination, or when I bike, or blah, blah, blah. So I was always trying to figure out a productivity system. I had a Google Home, which back in the day supported IFTTT, and IFTTT let you basically cut the command in half. So when you say, tell my assistant to..., you can take the second half and send it to any of the IFTTT services, which was pretty cool. Any time I had a to-do around the house, I would just tell my Google Assistant and it would just store it. It wasn't smart.
It wasn't AI, but I was building towards something where I can offload my thoughts and process them in a week. I realized that I never wanted a to-do app. I wanted sort of like a life OS. So slowly I've been going in that direction.

In 2016... I'm bad at naming, so just ignore the names of everything I ever built. So I made something called to do do, which was like a to-do app. But all the to-dos shoot up to the top based on a priority system. So if you tag them with something called health or crisis or whatever it is, they would just accumulate all of those points and shoot higher to the top. So it was kind of helping me to prioritize things. ADHD hit, of course, and I forgot about that one, and I started something called Better. This one was kind of hard to SEO, because good luck figuring out SEO for "better apps", so eventually I had to rebrand it. But I expanded here by adding to-dos, habits, planner, events, and a bunch of other things, because I realized if these three are not together, I can never make like a mini OS. Then of course ADHD hit and I switched to a bunch of other apps.

And in 2022 I started making Benji. It's named after my dog; my dog is the mascot. That's not the logo, but the point is I wanted an app to rule them all. I might have gone a little bit overboard. So on the next slide you're going to see... you're like, oh, probably here are the routines and calendar events and, like, what else? No. This is how much I hate marketing. If you're like, wait, I've never heard of Benji, how come? Because every time I had the urge to do marketing and to actually promote this to people, I was like, maybe one more feature. Maybe one more feature. It's almost like three, four years later and I still haven't properly wrapped this up. It's still not properly finished. But I was frustrated with using a web app for one thing, an iOS app for another thing. It supports this. It supports Android. It doesn't support this. Some of them are subscriptions.
Some of them are premium. I just wanted all of these features mangled into one tool that can sort of fix my life. Has it? Absolutely not. But we're going towards that. My vision is to one day have like a Benji phone and a Benji OS. The funny thing is I said this on a podcast and the guy was like, very ambitious for someone who doesn't have a landing page for Benji. So I didn't have a landing page, but one day I'm going to make like a Benji phone.

So the friction with making this life OS, whether it's in Notion or in something else like Benji: the annoying thing is you have to use forms to input data. So I oscillate between two states. I'm either for months locked into Benji and logging everything and doing all the things, or I completely ignore it. I don't care what's in there: to-dos, nutrition, whatever. I'm like, no, no, no, I don't want to look at it. And then in a few months I'll go back into that cycle, because there's a lot of friction in using all of these tools.

We had the ChatGPT moment. It was awesome. But when GPT plugins came out... I don't know if you remember that ancient relic; now it's transformed into MCPs and whatever. I called my wife and I was like, honey, it's over. It's over for all the apps, for all SaaS. GPT is going to eat the world. It's all going to be ChatGPT. It's all going to be within the thing. Benji is pointless. I wasted years on blah, blah, blah. Three years later, she's received so many of these calls that she just ignores me at this point. I'm like, oh my god, they dropped a new Opus. She's like, uh-huh, cool, cool. Nothing ever happens. But we're going towards this.

Like 2023, before the models could return JSON, you had to bully the models to return JSON. I don't know who remembers this. You had to be like, please don't write any Markdown. It's like, sure, here's some JSON. Like, no. So you had to parse it, to cut it, to shape it into form, to make some JSON.
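
The "parse it, cut it, shape it" step described above can be sketched roughly as follows. This is a minimal illustration, not Benji's actual code: it assumes the model wraps its JSON in a Markdown code fence or surrounds it with chatty prose, and the function name `extract_json` and the sample reply are made up for the example. Only the standard-library `json` and `re` modules are used.

```python
import json
import re

# Build the fence marker programmatically so this example can itself
# live inside a fenced code block.
FENCE_MARK = "`" * 3
FENCED = re.compile(FENCE_MARK + r"(?:json)?\s*(.*?)" + FENCE_MARK, re.DOTALL)


def extract_json(reply: str):
    """Return the first JSON object or array found in a chatty model reply."""
    fenced = FENCED.search(reply)
    # Prefer the contents of a code fence; otherwise scan the whole reply.
    candidate = fenced.group(1) if fenced else reply
    decoder = json.JSONDecoder()
    for index, char in enumerate(candidate):
        if char in "{[":
            try:
                # raw_decode parses one JSON value starting at `index`
                # and ignores any trailing text after it.
                value, _end = decoder.raw_decode(candidate, index)
                return value
            except json.JSONDecodeError:
                continue  # not valid JSON from here, keep scanning
    raise ValueError("no JSON found in reply")


if __name__ == "__main__":
    reply = (
        "Sure, here's some JSON:\n"
        + FENCE_MARK + "json\n"
        + '{"title": "Buy milk", "done": false}\n'
        + FENCE_MARK
        + "\nAnything else?"
    )
    print(extract_json(reply))
```

Structured-output features in current model APIs make this kind of cleanup mostly unnecessary, which is the contrast the speaker is drawing.
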
And I added a feature in Benji where you can press a key on your keyboard. It would record with the microphone. And as I was speaking, it would periodically cut some of what I was saying and basically call... it wasn't MCP, it wasn't anything. It would call APIs in Benji. And you can see your calendar moving live, and your to-dos and everything. And to people on Twitter, this was mind-blowing. It's like, holy shit, dude, you should pursue it. You should make something out of this. But ADHD was like, no, no, no. I don't like it. It went viral, which means we never have to talk about this again. So the Benji assistant still hasn't shipped. And I did nothing about it. Meanwhile, people took one feature of Benji, which is like, I don't know, food tracking. You take a picture with your phone and it analyzes calories. And they made multimillions. But I have 60 features. There's a lesson in there.

So last October, I realized that, wait, I'm using Claude Code. I can use it for more stuff. It has tool calls, functions, and a bunch of other stuff. Maybe I can tell it to do my taxes and end up in jail, hopefully not. Maybe I can tell it to organize my email and my to-do list and a bunch of other things. So when skills came out, I started loading my Claude Code with personal skills. But I'm like, wait, now I have coding skills, I have personal skills, it gets confused. I started asking people, how do I go and make this into a proper assistant? This lives on top of Claude Code, but it has tools for other stuff than coding. But ADHD was like, nah, we'll forget about this. Let Pete come up with Clawdbot and everything else. You don't need to worry about this.

Claude Code had the wrong shell for me because it was terminal-based, and I craved something else. So when Peter made Clawdbot back then, when I saw the tweet, I'm like, oh my God, you can talk to it through WhatsApp or Telegram or whatever. For me it was like, that's the moment.
That's what I needed for my Claude setup to actually evolve into the next thing. My brain caught on fire. I think we got mass psychosis. It turned into a cult, everyone wearing lobster suits. It's been crazy for a while. And I joined a Discord. And it was less than 100 people who had their Clawdbot set up. Even Peter was like, how did you do this? There's no onboarding. There's no... like, how did you do it? And I told him what I'm telling people now: I don't know how the internals of my setup work. I just ask either Codex or Claude Code to fix it, to change it, to improve the memory, to do this, to do that. But I have no freaking idea. People are like, what do you have in your JSON file? I'm like, I haven't seen a JSON file since four years ago. I don't know. I just ask my bots and they just fix the things.

So for a while, I went full lobster mode. This is me at the first meetup in Vienna in a lobster suit. I made that logo; I actually made the OpenClaw logo at 2 a.m. I started wearing all of this lobster merch in tutorials and podcast guest spots, talking about all the use cases and blah, blah, blah. And finally, as someone who's been obsessed with to-dos and productivity since 10 years old, I'm like, the future is finally reachable. All my files from Google Drive and iCloud, presentations, photos from high school, and all the things that I have piled up, and unfinished business ideas: I could see how OpenClaw can just magically wave the lobster hands and just fix everything in my life.

So I was immediately done with all the cloud models. I went full hipster mode. No more Gemini, no more ChatGPT, no more Claude. I wanted to fully... I got the power of finally owning the assistant, owning the files, owning the memory, deleting the sessions if I wanted to. So it felt fully local. So naturally, I started preparing all my data for agents.
I went from the guy who was always using the cloud and stuff to annoyingly self-hosting everything. Everything has to come off the cloud. It has to be local on my NAS and my machine just so my agents can actually work on it. So these are still a work in progress. The classic work in progress that's going to finish one day. But I started moving to locally hosted stuff, like Nextcloud, Immich, local Markdown. For everything that requires a lot of API calls or MCP and whatever, I would rather just have it local, to work on all of this locally. I went that far, and I went back to Android. I feel like this thing, in a way, you know, enchanted me. I'm like, who am I? I don't recognize myself anymore. Because I wanted my agent to be able to read my notifications, clear my notifications, install apps, uninstall apps. It can do anything on an Android phone, and on iOS it can maybe send you a push notification, if Tim Cook allows.

I was planning to do like 10, 15 more slides (sorry for the flashbang there) of use cases, but then they told me the presentation is exactly 18 minutes. So I did that one already. It's on YouTube. It's on a bunch of podcasts. Probably all of you have maybe even more use cases than me. But we do weekly meetups in the Tinkerer Club and we talk mostly about OpenClaw. And I love to ask this question. When people tell me which use cases they have, I then ask them: but which ones of them can you not do with Claude Code and with Codex? And immediately it just reduces by 90%, because it's like, yeah, I can kind of do that with Claude Code.

So I've also been asking myself, what is the value of having a packaged agent like OpenClaw? I think that one-on-one chat with one agent sucks, because if you think about delegating in your life, if you have business and personal and family and blah, blah, blah, you don't want to have one employee loaded with all the information about your life, talking in Telegram in a one-on-one chat for everything.
So more people started using Telegram topics. They started using Discord, Slack, and other stuff just to get organized. I like the idea of specialized agents, which OpenClaw supports, but not a lot of people use them. Basically they have a provider, a model, a level of thinking, a system prompt or a soul, a list of tools and MCPs, and a list of permissions. I like that this is packaged, and we're going to talk with this agent about fitness.

Now, people talk about LLM psychosis. I'm out here going crazy. These are all of the bots that I created, and I try to contain every bot to have a purpose in my life. Some of them are for work, some of them... don't take photos of my chats. So now I ended up with five Discords. The funny thing is, as I keep talking, keep in mind that my life is far from solved. It's never been more chaotic. I've never been more late on rent, on a mortgage, on customer emails. It's a mess. But it's a performative mess. Right? So I ended up with five Discords, and each Discord has many channels and threads and forum posts and nested thingies and blah, blah, blah.

And then, inevitably... I mean, you can sense this across the community. I sense it across Tinkerer Club, because in the beginning it was an explosion of signups, of people joining the meetup. They're like, oh my god, weekly calls, we're going to crush the world. And now if you enter a meetup, it's like five people, and it's slowly turning into OpenClaw Anonymous, and everyone is like, yeah, mine didn't do it, the fucking cron jobs, man, they drive me fucking... So it's been depressing, but I think we'll bounce back. We'll figure it out.

Why is this happening? Because it was, and kind of still is for me, unreliable where it matters most, which is cron jobs, multi-agents, the agents talking to each other, the agents forgetting literally in the next message. They're like, ah, what are you saying?
And I'm like, the message is above you, just go one message above you. This is getting fixed, and it's getting updates every day, but I've yet to see that it's actually, you know, working. This is not OpenClaw's or any other agent's fault, but Discord and Telegram were not meant for a life OS. We're just molding them into something, but they'll never be the right UI for you to manage your life fully. It's like coping: until we get to something else, we're going to use Discord or Telegram.

And finally Anthropic, as I like to call them, ruined the charm of it. As soon as you pull the model, talking to GPT-5 feels like talking to a box of oats. Seriously, it has the personality of this. Try this. You're like, okay, did you do that? No. But I told you to do it. Okay, I'll do it. Did you do it? No. Every conversation with OpenClaw looks like that lately, and it drives me nuts.

So what now? Where do we go from here? I don't know how much time I have left. It says six minutes. Where do we go from here? I see two futures fighting each other, and I don't think that either of them is going to win in the long run. We have these custom agents like OpenClaw, Hermes, or whatever else is possible. And we have cloud agents, because everyone is trying to grab a slice of the pie now. We have Cowork, and OpenAI is going to have a thingy, and Perplexity is trying to make a thing, and everyone is trying to make their cloud thing. Those are the cloud ones.

So the custom ones are never going to work, because they're for tinkerers. And I'm telling you, in Tinkerer Club we have people who are building their own pinball machines. Talk about tinkerers: they tinker with everything. And they're freaking tired of trying to make this thing work. Let alone people who have lives, let alone people who have busy lives and jobs and whatever else. No one will have time to tweak this.
They would just like a served solution, so everything works out of the box. Not me. I'm not going to be happy until I, you know. And then cloud agents: I tried Claude Cowork for like five minutes and I'm like, this is too nerfed. This is not an OpenClaw alternative. It cannot do even five percent of the things that OpenClaw can do. So this will be for the masses, but it won't satisfy the tinkerers, the people who want to self-host, own the models, and blah, blah, blah.

So, two directions here: what am I going to do personally, for myself, and what do I think is going to happen next in the actual industry. I'm currently juggling between OpenClaw, Hermes, Paperclip... is anyone using Paperclip? It's kind of this cool Kanban, Linear-like thingy for agents, wasting a lot of credits. I'm trying plain tmux with Codex a lot of the time. When you reach peak frustration with the first three, you're like, fuck it. When you open the terminal, you're like, maybe the agents are not that smart. So I'm juggling between all of this and I'm using all of them daily, but there's this hesitation that I have. Like, I wanted to see where the location for the venue is, and I had two options: open the website or go to Discord. And I'm like, I don't want to talk to that box of oatmeal, you know. It's going to be like, yeah, I'll find the location in your email. Did you? No, are you ready for it? It keeps asking you, are you ready for the thing you told me to do? It's crazy.

So I started making my own thing. Naturally, you can see the progression. It's never going to see the light of day. It's not for people, it's just an experiment for me. I call it Wolfer. And I'm not making it for mass appeal, I'm not making it for everyone to use. I'm trying to see how I can make a tiny abstraction on top of Codex or Claude Code, rest in peace. I'm afraid to use Claude Code because I might get arrested. So it's only on Codex for now.
And it's not extensible and it doesn't support a billion providers. So I'll start with the cons: what sucks? You're forced to use the chat UI of the actual app, and you cannot use Telegram or iMessage or whatever. There's no support for any of this. It's absolutely the opposite of OpenClaw and Hermes. It's not built with plugins in mind. The idea is to have everything in it. There's no memory system. I'm not really selling the thing. But none of these things are there out of the box. It's not very modular. It's made by an ADHD squirrel brain that will forget about it by the end of the month. And it doesn't have OpenAI funding and it doesn't have a cool lobster logo. These are the cons.

But the pros, and why I would suggest all of you maybe dabble with this and try to make your own, or maybe eventually try mine if I ever release it for people: it has predictable conversations. And the UI that I made... you go to the Wolfer app, like Wolfer.whatever the URL is. It has a predictable UI that's made for multi-agent orchestration across multiple topics, multiple conversations. Everything was made for this purpose. It's not like you're taking Discord and you're trying to mold it for a certain purpose.

And my favorite feature, because I don't believe in memory of agents. People are like, oh, we finally saw Milla Jovovich solve memory. I'm like, no, absolutely, she did solve memory. What I believe in is this: I have nested topics. I have, like, Work, Projects, Benji, Customer support. Let's say that's the nested tree. And when I'm talking to Benji customer support, in the first prompt it injects the descriptions of all the parent topics. So when I'm talking to Benji customer support, it doesn't need to pull from memory or some magical place. It just looks at the topic, the parent topic, the parent topic, the parent topic. It takes all the descriptions together and it immediately knows what is my work, what is Benji, what are my projects, and how do I do customer support.
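
The nested-topic idea just described (walk from the current topic up through its parents and put all their descriptions into the first prompt) can be sketched like this. It is an illustration of the technique only, not Wolfer's actual code, which the talk does not show: the `Topic` class, the `first_prompt` function, the prompt layout, and the topic descriptions are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Topic:
    """One node in a hypothetical topic tree."""
    name: str
    description: str
    parent: Optional["Topic"] = None

    def lineage(self) -> List["Topic"]:
        """Return topics from the root down to this one."""
        chain = []
        node = self
        while node is not None:
            chain.append(node)
            node = node.parent
        chain.reverse()
        return chain


def first_prompt(topic: Topic, user_message: str) -> str:
    """Prefix the first message with every ancestor's description,
    so no separate memory lookup is needed."""
    context = "\n".join(
        f"[{t.name}] {t.description}" for t in topic.lineage()
    )
    return f"{context}\n\n{user_message}"


# Example tree mirroring the one in the talk; descriptions are invented.
work = Topic("Work", "Freelance and product work.")
projects = Topic("Projects", "Active software projects.", parent=work)
benji = Topic("Benji", "All-in-one life OS app.", parent=projects)
support = Topic("Customer support", "Reply politely; escalate refunds.", parent=benji)

if __name__ == "__main__":
    print(first_prompt(support, "A user cannot log in."))
```

The context is deterministic: it depends only on where the conversation sits in the tree, not on what a retrieval step happens to find.
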
And I can get more out of that than hoping for some memory system that's going to pull the right context out of the right place. It kind of works for me. It supports workspaces; I can switch between workspaces. I hated how tool calls were displayed. I like to see tool calls, to collapse them, to uncollapse them, to see loading spinners. There are buttons for stopping the thing. I don't need to use slash commands. The cron jobs are predictable. And when you get a cron message, it actually reads from the entire conversation and it labels it as cron. It's not, where did this come from and why is the agent lost?

There's UI for managing agents, which, for my brain, I really needed. When I chat in a topic, on the right side I see that the agent is Chandler and he has this model and this capability. It really helps me to know who I am talking to, and to just tweak it and be like, no, no, no, you don't need that capability. Boom, it disappears. I would have included screenshots, but the app didn't work because it's on my Mac Studio at home. It's a long story. But imagine the screenshots, it's kind of cool.

And I like that there's a knowledge base and documents, so you can write Markdown documents in the thing and you can @ them, because in Discord you can only @ other members. There's no dynamic @ to mention something else. And here I can mention, for example, hey, let's fix the landing page of Benji, and then I'll @ the landing page of Tinkerer Club, for example, or I can @ a knowledge base or a password or a skill. So I can combine multiple @s and give it the exact context that it needs for the actual thing.

But what do I think is going to happen next? Because this is definitely not going to be a mainstream thing. What's going to happen next in the entire agent industry, and what are people going to do? This is my prediction. I think the way we use computers right now is absolutely insane.
Does anyone agree with me? Have you finally got this feeling when you open your computer, like, computers shouldn't be this way? One person, two... okay, we have a lot of people. Like, I open my computer after a few hours and it greets me with 17 updates for apps I haven't used in a while, and it greets me with tabs that I've had open since yesterday. How I imagine it in the future: it would need to ingest all the information about my life, like notifications and emails and everything that's happening in my life. And depending on how long I've been away from the computer, it should greet me with the next task to work on, and then the next one and the next one. And it should maybe give me a break and be like, hey, enough, let's do this, let's do that.

So in a way, I think the role of AI is going to invert. The way we prompt the AI right now, I think it's going to invert, and the fully productive people will be the ones who delegate 99% of the stuff to the AI. And then the AI prompts you. It's like, hey, you didn't send me a picture of your passport, or, hey, what do you want to do? You basically make decisions, and you basically click through forms or answer questionnaires or whatever it is. But in the background, there's something constantly working for you instead of you prompting it all of the time.

I agree with this sentiment. People are like, but my grandma will never vibe code. That's 100% true, because I think where we're going, we're actually not going to need most consumer apps. No, your grandma, your mom, or your friends are not going to vibe code, but they'll be able to sit in this new futuristic OS and they'll be able to do any task that they want to do. Either the UI is going to pop up on the fly, or whatever it needs, but they'll be doing tasks and they'll forget about "I need an app to do a task." They'll just do it.
A small set of apps will survive, but it will be software for specialists, people who are doing, I don't know, color grading or movie making or music making, where they actually need software. But normies will just chat to their computer, and their computer will do things, and the UI will generate on the fly.

I also think it would be the funniest thing if Apple wins all of this, because local models are getting insanely good. And they're going to get even better this year and next year, and I think most normies, most people, will be completely fine with a local agent like Siri getting tool-call abilities from all of the locally installed apps, not wasting any credits. Their data doesn't go anywhere, and their phone is magically doing things. The latest Google Pixel can already launch your apps in the background and order coffee and do a bunch of things for you. So I think that's where everything is going.

So I'm over time. Thank you for listening to my rant. Hopefully we can discuss afterwards. And thank you very much. Thank you. Thank you.
