Contacts on the Mac is not much different from iCal. We can export all entries from the Contacts application into a vCard. The resulting *.vcf file can then be parsed and written into the Address Book PDB format using Perl.
So what’s different?
The key differences between Datebk and Address are (1) the availability of “categories” in Address, and (2) the higher likelihood of running into Unicode characters. Coming from a multilingual society where most text is not ASCII, this suddenly became an interesting problem: parsing the information in Perl and “depositing” it into a device that is essentially ASCII-only.
Handling categories was relatively simple. We just had to make sure we could obtain the category information from the vCards and match it to the same category IDs used in the PDB. If a vCard has no category entry, it simply gets slotted into “Unfiled”. The Perl libraries used were Text::vCard::Addressbook for parsing and Palm::Address for writing out to the address book PDB format.
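The category matching can be sketched roughly as follows. This is only a sketch and not the actual script: category_index is a hypothetical helper, the category names are made up, and it assumes that “Unfiled” sits at index 0 of the Palm category list, which is the usual Palm convention.

```perl
use strict;
use warnings;

# Hypothetical helper: map a vCard category name to a Palm category index.
# $palm_categories is an array ref of category names in PDB order.
# Assumption: index 0 is "Unfiled".
sub category_index {
    my ($palm_categories, $vcard_category) = @_;

    # No category on the card: fall back to "Unfiled".
    return 0 unless defined $vcard_category && length $vcard_category;

    for my $i (0 .. $#{$palm_categories}) {
        # Compare case-insensitively so "personal" matches "Personal".
        return $i if lc($palm_categories->[$i]) eq lc($vcard_category);
    }

    # Category not known to the Palm: also fall back to "Unfiled".
    return 0;
}

my @palm_categories = ('Unfiled', 'Business', 'Personal');
print category_index(\@palm_categories, 'Personal'), "\n";   # prints 2
print category_index(\@palm_categories, undef), "\n";        # prints 0
```

In a real run the list of names would come from the category table of the loaded address PDB rather than a hard-coded array.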
One other thing to note is that the modern vCard format has lots of items that were never available on the Palm. The simple way to deal with these was to have ALL the information parsed and stored in the Notes field on the Palm. That way, I effectively don’t lose any information when looking up data, and I can still sort by and keep the most important information visible.
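A minimal sketch of that “dump everything into Notes” idea might look like the following. build_note is a hypothetical helper, and the property names and values are placeholders; how the pairs are pulled out of each vCard is left out here.

```perl
use strict;
use warnings;

# Hypothetical helper: flatten every vCard property into one note string.
# $properties is an array ref of [name, value] pairs, in vCard order.
sub build_note {
    my ($properties) = @_;
    my @lines;

    for my $pair (@{$properties}) {
        my ($name, $value) = @{$pair};
        next unless defined $value && length $value;   # skip empty properties
        push @lines, "${name}: ${value}";
    }

    # One "NAME: value" line per property.
    return join "\n", @lines;
}

my $note = build_note([
    ['URL',             'https://example.com'],
    ['BDAY',            '1980-01-01'],
    ['X-SOCIALPROFILE', ''],                 # empty, so it is dropped
]);
print "$note\n";
```

The resulting string would then be assigned to the note field of the Palm record, while the handful of fields the Palm does understand are filled in as usual.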
As with iCal, this is really a 1-way sync. Changes in the Palm wouldn’t be updated back into Contacts, but IMHO, that isn’t really required.
What about Unicode and UTF8?
This was the interesting part. Initially, parsing a small subset of my vCard export, including categories and notes, was successful. But when I finally exported the full data, the Palm crashed. Actually, I can’t really remember whether it crashed, but I do remember that it didn’t go well. (I’m writing this about three quarters of a year after I wrote the code.) When I finally discovered what was wrong, it was clear that, as awesome as the Palm Pilots were, they didn’t handle UTF-8 well. I had two options: (1) install CJKOS to get the right fonts on my device, and then get the data into the PDB in a matching character set, or (2) strip out the UTF-8 data so that only the ASCII text was left behind.
Neither was a particularly good solution. (1) would mean constantly making sure that the character sets stayed in sync, and I couldn’t find CJKOS anymore anyway. (2) was worse: I would lose data. This would render some of the cards useless, as there are folks for whom I only have information in their native language. I have also gotten used to seeing their names in Mandarin. Romanizing their names would therefore not be a good solution.
As I dug around for a solution, I finally came across a Perl module called Text::Unidecode. This is an awesome little module that, to the best of its ability, transliterates Unicode text into a phonetic ASCII equivalent. And I’ll say, the results are pretty amazing. Testing it on Mandarin, it was able to get the Hanyu Pinyin pronunciations of the words I threw at it. Below is one of the examples.
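A minimal sketch of calling the module is shown here. to_ascii is a hypothetical wrapper and not part of Text::Unidecode; the sample input is the one used in the module’s own documentation, which gives “Bei Jing” as the result.

```perl
use strict;
use warnings;
use Text::Unidecode;   # exports unidecode() by default

# Hypothetical helper: reduce a Unicode string to plain ASCII for the Palm.
sub to_ascii {
    my ($text) = @_;
    my $ascii = unidecode($text);
    $ascii =~ s/\s+$//;   # drop the trailing space left after CJK characters
    return $ascii;
}

# Sample from the Text::Unidecode documentation (U+5317 U+4EB0).
print to_ascii("\x{5317}\x{4EB0}"), "\n";   # Bei Jing
```

The output carries no tone marks, so it is a readable approximation of the name rather than a reversible encoding of it.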
I was impressed.
I immediately integrated this into every field that I had across iCal and Contacts, which potentially saved me a lot of grief porting data from a modern system to a more vintage technology like the Palm, which doesn’t understand Unicode.
With two functions down, I only have two more to go: Things and Evernote.
Update 14/1/2019: I have just come across a small bug that caused Unicode characters to come out as gibberish. This seems to be due to some kind of bad encoding in the text string, which my Mac is able to decode but Text::Unidecode is unable to handle. To fix this, all strings are now passed through Encoding::FixLatin to ensure that the text is encoded correctly. Once that is done, Text::Unidecode performs as expected.
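The two-step clean-up can be sketched like this. clean_field is a hypothetical helper, and the sample strings are illustrations rather than the data that triggered the bug.

```perl
use strict;
use warnings;
use Encoding::FixLatin qw(fix_latin);
use Text::Unidecode;

# Hypothetical helper: repair the encoding first, then transliterate.
sub clean_field {
    my ($raw_bytes) = @_;

    # fix_latin() accepts bytes that may mix Latin-1, CP1252 and UTF-8
    # and returns a properly decoded character string.
    my $text = fix_latin($raw_bytes);

    # Only now reduce the text to ASCII for the Palm.
    return unidecode($text);
}

print clean_field("Caf\xE9"), "\n";       # Latin-1 byte for e-acute   -> Cafe
print clean_field("Caf\xC3\xA9"), "\n";   # UTF-8 bytes for e-acute    -> Cafe
```

The order matters: Text::Unidecode works on characters, so handing it raw or mis-decoded bytes is what produces the gibberish.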