Little rant on Microsoft Word 07-03-2019, 09:58 AM
#1
So, at work I'm working on a docx parser to import docx files into our application.
I'm a junior dev and I was assigned to this issue. A total of 100 hours is planned for it, and that's way too much time, but oh well.
So, everything was good until the client tried to mass import custom files that weren't provided by the company. That broke the entire application and import flow.
Sometimes the last text line or element didn't get parsed by my parser. I already knew about this issue but didn't know what was causing it.
So, I started looking into the docx file's source (a docx file is just a zip archive of multiple XML files), and I saw that every time I made my own changes in the Word file, it added an element called bookmarkStart with a name of "_GoBack". So I started looking into what this was.
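For anyone who wants to poke at this themselves, here is a rough Python sketch of that kind of inspection. It assumes the main body lives in word/document.xml (the usual location) and uses only the standard library. The names list_bookmarks and build_sample_docx are made up for this example, and the sample archive is a stripped-down stand-in, not a file Word would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in document.xml).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"


def list_bookmarks(docx_source):
    """Return (id, name) for every w:bookmarkStart in word/document.xml.

    docx_source may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_source) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    return [
        (el.get(f"{W}id"), el.get(f"{W}name"))
        for el in root.iter(f"{W}bookmarkStart")
    ]


def build_sample_docx():
    """Build a tiny in-memory zip holding only word/document.xml (hypothetical sample)."""
    document_xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body><w:p>'
        '<w:r><w:t>Hello</w:t></w:r>'
        '<w:bookmarkStart w:id="0" w:name="_GoBack"/>'
        '<w:bookmarkEnd w:id="0"/>'
        '</w:p></w:body></w:document>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", document_xml)
    buffer.seek(0)
    return buffer


print(list_bookmarks(build_sample_docx()))
```

With a real file you would pass the path instead, e.g. list_bookmarks("some_file.docx").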
Eventually I found the answer on the Microsoft forum. Apparently, when you close Word, it adds its own element, which is used when you open the file again.
Word uses it to show that alert that helps you get back to where you were (if you use Word a lot you'll know what I'm talking about).
The reason it broke everything is that a regular bookmark in OOXML (the format Word uses) also starts with bookmarkStart and ends with bookmarkEnd. If you don't handle this the right way (and I didn't), it will cause these kinds of issues and you'll want to blow your head off.
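To make that a bit more concrete, here is a minimal sketch of one way to keep bookmark markers from getting in the way: only read w:t text nodes inside w:r runs and ignore every other child of the paragraph. This is not my actual parser, just an illustration. paragraph_texts and SAMPLE_XML are hypothetical names, and it deliberately skips runs nested in things like hyperlinks.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"

# Hypothetical document body: the last paragraph starts with a _GoBack bookmark.
SAMPLE_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    '<w:p><w:r><w:t>First line</w:t></w:r></w:p>'
    '<w:p>'
    '<w:bookmarkStart w:id="0" w:name="_GoBack"/>'
    '<w:bookmarkEnd w:id="0"/>'
    '<w:r><w:t>Last line</w:t></w:r>'
    '</w:p>'
    '</w:body></w:document>'
)


def paragraph_texts(document_xml):
    """Return the text of each w:p, built only from w:t nodes in direct w:r children.

    Bookmark markers are siblings of the runs, so they are simply never visited.
    """
    root = ET.fromstring(document_xml)
    texts = []
    for paragraph in root.iter(f"{W}p"):
        parts = [t.text or "" for t in paragraph.iterfind(f"{W}r/{W}t")]
        texts.append("".join(parts))
    return texts


print(paragraph_texts(SAMPLE_XML))
```

The point is that the parser selects what it understands instead of assuming every child of a paragraph is a run.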
So, the _GoBack bookmark was safe to delete, and deleting it solved all the issues.
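Roughly, the deletion can look like the sketch below. It assumes the start and end markers of a bookmark share the same w:id, which is how they are paired, and it leaves all other bookmarks alone. strip_goback and SAMPLE_XML are names invented for this example. One caveat: the standard library's ElementTree may rewrite namespace prefixes when it serializes a real document.xml, so treat this as an illustration of the logic rather than something to run on production files as-is.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"

# Hypothetical input: one _GoBack bookmark and one ordinary bookmark.
SAMPLE_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body><w:p>'
    '<w:bookmarkStart w:id="0" w:name="_GoBack"/>'
    '<w:bookmarkEnd w:id="0"/>'
    '<w:bookmarkStart w:id="1" w:name="keep_me"/>'
    '<w:r><w:t>Last line</w:t></w:r>'
    '<w:bookmarkEnd w:id="1"/>'
    '</w:p></w:body></w:document>'
)


def strip_goback(document_xml):
    """Return the XML with the _GoBack bookmarkStart/bookmarkEnd pair removed."""
    ET.register_namespace("w", W_NS)  # keep the w: prefix when serializing
    root = ET.fromstring(document_xml)

    # Ids of every bookmark named _GoBack; the matching end marker has the same id.
    goback_ids = {
        el.get(f"{W}id")
        for el in root.iter(f"{W}bookmarkStart")
        if el.get(f"{W}name") == "_GoBack"
    }

    # ElementTree elements have no parent pointer, so visit each parent
    # and remove matching children from it.
    markers = (f"{W}bookmarkStart", f"{W}bookmarkEnd")
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag in markers and child.get(f"{W}id") in goback_ids:
                parent.remove(child)

    return ET.tostring(root, encoding="unicode")


cleaned = strip_goback(SAMPLE_XML)
print(cleaned)
```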
I just got so mad at Microsoft for making it so easy to fail (by making it a bookmark element) instead of making it a new element that will not fuck with all the other nodes.
I know this might be quite hard to understand but it's just a rant and it would take way too long to explain OOXML.
Have you guys ever had such a problem?
TL;DR: A stupid feature in Word broke my whole import flow.