Finding a good data structure for a word processor is a difficult problem. My notebook diaries on the problem go back 25 years, to when I was frustrated with using Word for my diploma thesis; it was slow and unstable at that time. I ended up getting pretty hooked on the problem.
Right now I’m taking a professional break and have decided to finally use the time to push these ideas further and build MiniWord, a WYSIWYG word processor in Python.
My goal is to have a native, non-HTML-based editor that stays simple, fast, and hackable. So far I am focusing on getting the fundamentals right. What is working so far:
- Real WYSIWYG editing (no HTML layer, no embedded browser) with styles, images and tables.
- Clean, simple file format (human-readable, diff-friendly, git-friendly, AI-friendly)
- Markdown support
- Support for Python plugins
Things that I found:
- B-tree structures are perfect for holding rich text data
- A simple text-based file format is incredibly useful — you can diff documents, version them, and even process them with AI tools quite naturally
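To make the B-tree point a bit more concrete, here is a minimal sketch of the general idea. It is not MiniWord's actual code; the names `Run`, `Node` and `find` are hypothetical, and splitting and rebalancing of nodes are left out. Inner nodes cache the character count of their subtree, so a cursor position can be located by descending the tree instead of scanning the whole document.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class Run:
    """Leaf: a piece of text with one style (style names are made up here)."""
    text: str
    style: str = "normal"

    def __len__(self) -> int:
        return len(self.text)


@dataclass
class Node:
    """Inner node that caches the total character count of its subtree."""
    children: List[Union["Node", Run]]
    length: int = field(init=False)

    def __post_init__(self) -> None:
        self.length = sum(len(child) for child in self.children)

    def __len__(self) -> int:
        return self.length


def find(node: Union[Node, Run], index: int) -> Tuple[Run, int]:
    """Return the run containing character `index` and the offset inside it."""
    if not 0 <= index < len(node):
        raise IndexError(index)
    while isinstance(node, Node):
        for child in node.children:
            if index < len(child):
                node = child  # descend into the child that covers the index
                break
            index -= len(child)  # skip this subtree using its cached length
    return node, index


# Assumed example document: two paragraphs' worth of styled runs.
doc = Node([
    Node([Run("Hello "), Run("bold", "bold")]),
    Node([Run(" world")]),
])
run, offset = find(doc, 7)
print(run.style, run.text[offset])
```

A real implementation would also keep nodes within a fixed fan-out and update the cached lengths on every insert or delete; the sketch only shows the lookup side.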
What I’d love feedback on:
- Where do you see real use cases for something like this?
- What would be missing for you to take it seriously as a tool or platform?
- What kinds of plugins or extensions would actually be worth building?
Happy about any thoughts — positive or critical. Greetings
This is a famous "killer" feature from WordPerfect: the ability to view and edit the low-level formatting for a document. It's invaluable for fixing weird bugs.
However, it works only because WP uses the "text-stream" paradigm, where a document comprises a linear stream of text with formatting codes (Bold, Font, Hard Return, etc.) embedded directly at the point at which they're applied.
In contrast, Word uses the "nested containers" model (characters inside words, words inside paragraphs, paragraphs inside sections, etc.), where this feature can't be replicated.
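A rough sketch of the text-stream idea, for illustration only: the tuple encoding, the bracket notation and the function names `reveal_codes` and `plain_text` are assumptions made for this example, not WordPerfect's actual format. The point is that once the codes sit inline in a linear stream, showing them next to the text is a simple pass over that stream.

```python
from typing import List, Tuple, Union

# A stream item is either plain text or a formatting code such as ("on", "Bold").
Code = Tuple[str, str]
Item = Union[str, Code]

# Assumed example document: "Hello world!" with "world" in bold.
stream: List[Item] = ["Hello ", ("on", "Bold"), "world", ("off", "Bold"), "!"]


def reveal_codes(items: List[Item]) -> str:
    """Render the stream with every formatting code shown inline."""
    parts = []
    for item in items:
        if isinstance(item, str):
            parts.append(item)
        else:
            state, name = item
            parts.append(f"[{name} {state}]")
    return "".join(parts)


def plain_text(items: List[Item]) -> str:
    """Render only the text, skipping all formatting codes."""
    return "".join(item for item in items if isinstance(item, str))


print(reveal_codes(stream))
print(plain_text(stream))
```

Because each code is just another item in the list, deleting a stray one is the same operation as deleting a character, which is what makes this kind of view editable.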
I didn't look closely at your code, but just thought to mention this feature.
Very nice! Unfortunately, the UI menus seem to be broken when using a dark-mode GTK theme (e.g. Adwaita Dark).
I still don't understand why people still use ~~Microsoft Word~~ Copilot document writer. I think they have gotten into some weird mindset that their documents require all this weird unnecessary formatting to look "official".
Also, WYSIWYG doesn't mean it can't be backward and forward compatible with Markdown; it might just mean that it's a Markdown editor GUI with a preview.
This annoyed me until I realized pandoc supports separating [the link text] from the link location.
[the link text]: </url/to/resource> "`title` parameter of the <a> tag, if converted to HTML"
Not for a layperson. There’s a reason WYSIWYG word processors completely obliterated the previous generation, the ones that needed an explicit preview mode.
I was also wondering about something like this as I read about different countries switching to Linux and needing overly complex office software because they are entrenched in the thinking that they need Microsoft Office.
Why do you NEED an office clone? What is it in your job that requires anything more than the simple text and formatting that something like Markdown provides?
I always envy people who can use computers as tools (like scientists/math people) and not as fancy distraction devices. Those people, from what I see, don't care about the OS, what it looks like, etc. They just want to use the computer as a tool to help them solve problems.
On a third tangent: I was once given a PDF of data to process (instead of just the CSV), because people don't understand computer formats and try to use things that they think make them look "professional".