Validating CSV schema

CSV, or comma-separated values, is a very common format for managing all kinds of configurations, as well as for data manipulation.  As the linked Wikipedia page mentions, there are a few RFCs that try to standardize the format.  However, I thought there was still no schema-style standard that would let one define the format of a particular file.

Today I came across an effort that attempts to do just that – CSV Schema Language v1.1 – an unofficial draft of a language for defining and validating CSV data.  This is a work in progress by the Digital Preservation team at The National Archives.

Apart from the unofficial draft of the language, there is also an Open Source CSV Validator v1.1 application, written in Scala.
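
To make the idea of a "schema for a CSV file" a bit more concrete, here is a minimal sketch in Python.  It does not use the CSV Schema Language syntax or the CSV Validator mentioned above; the SCHEMA dictionary, the column names, and the validate_csv function are all made up for illustration.  The assumption is simply that a schema lists the expected columns in order and gives one regular expression per column.

```python
import csv
import io
import re

# Hypothetical schema (not CSV Schema Language syntax):
# column name -> regular expression that every value in that column must fully match.
SCHEMA = {
    "name": r".+",
    "age": r"[0-9]{1,3}",
    "email": r"[^@\s]+@[^@\s]+",
}


def validate_csv(text, schema):
    """Return a list of error strings; an empty list means the data is valid."""
    reader = csv.DictReader(io.StringIO(text))
    errors = []

    # The header row must list exactly the schema's columns, in the same order.
    if reader.fieldnames != list(schema):
        errors.append(
            f"header mismatch: expected {list(schema)}, got {reader.fieldnames}"
        )
        return errors

    for row in reader:
        for column, pattern in schema.items():
            value = row[column]  # None when the row has too few fields
            if value is None or re.fullmatch(pattern, value) is None:
                errors.append(
                    f"line {reader.line_num}: bad value {value!r} in column {column!r}"
                )
    return errors


if __name__ == "__main__":
    sample = "name,age,email\nAlice,30,alice@example.com\nBob,abc,bob@example.com\n"
    for error in validate_csv(sample, SCHEMA):
        print(error)
```

A real schema language goes much further than this (optional columns, cross-column rules, and so on), but the basic shape is the same: a declarative description of the columns, plus a tool that checks a file against it.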

Do you know YAML?

I thought I did.  Especially after all the hours spent with Ansible.  Turns out I don’t.  I have a very limited understanding of the YAML format.  How do I know that, you ask?  Well, that’s because I am reading the YAML specification now.


Holy Moly, that’s an interesting format!  Highly recommended weekend reading.

GeoJSON – an open format for encoding a variety of geographic data structures

Looks handy.  Learned about it while reading the GitHub blog post on announcing the support for interactive display of GeoJSON files in repositories.
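
For a feel of what the format looks like, here is a small sketch that builds a GeoJSON document with nothing but Python's standard json module.  The point_feature helper and the sample place are my own illustration, and the coordinates are approximate.  The one thing worth remembering is that GeoJSON positions are written as longitude first, then latitude.

```python
import json


def point_feature(longitude, latitude, properties):
    """Build a GeoJSON Feature holding a single Point (hypothetical helper)."""
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            # GeoJSON order is [longitude, latitude], not the other way around.
            "coordinates": [longitude, latitude],
        },
        "properties": properties,
    }


# Approximate coordinates, used purely as sample data.
collection = {
    "type": "FeatureCollection",
    "features": [point_feature(33.3823, 35.1856, {"name": "Nicosia"})],
}

if __name__ == "__main__":
    print(json.dumps(collection, indent=2))
```

Saving the printed output to a file with a .geojson extension should be enough to try it out with any tool that understands the format.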

JIF, not GIF

Mashable reports that Steve Wilhite, the inventor of GIF (Graphics Interchange Format), during the lifetime achievement ceremony insisted that it’s pronounced “JIF”, not “GIF”.

[rant mode on] What?!  “GIF” is not “GIF”, but “JIF”?  Nonsense!  It’s give, not jive.  Girls, not jirls.  Gift, not jift.  And even if he believes that’s the correct way to pronounce it, how irresponsible is it to draw attention to this issue now?  It’s just like throwing a barrel of petrol onto the flames of the pronunciation holy wars over GNU, Gnome, Gimp, and other pillars of Open Source Software. [rant mode off]