Here’s a good collection of cheatsheets for anyone involved with Big Data and machine learning. Whether you are already well versed in the subject, or just starting, I’m sure you’ll find something useful.
And while we are on the subject of machine learning, check out this repository for examples in Python, along with explanations of the theory and math behind many of the algorithms involved.
It is via this Cyprus Mail article that I’ve learned not only that Cyprus has an official Open Data portal, but also that it’s the best in Europe:
Cyprus is one of the top five European Union countries in the field of Open Data for 2018, while the new National Open Data Portal data.gov.cy scored highest among 31 open data portals in Europe, a special honour and recognition for the Open University of Cyprus (OUC) that developed and implemented the National Open Data Portal in collaboration with the public administration and personnel department of the finance ministry.
So far I’ve only had a quick look around, and I have to say that it’s quite impressive! Even though most of it is in Greek, Google Chrome translation handles it nicely. Here are a couple of interesting bits to get you started:
Metabase is an Open Source business intelligence and analytics tool. It supports a variety of databases and services as sources for data, and provides a number of data querying and processing tools. Have a look at the GitHub repository as well.
And if you want a few alternatives or complementary tools, I found this list quite useful.
Almost a decade ago, my colleague Deepak Singh introduced the AWS Public Datasets in his post Paging Researchers, Analysts, and Developers. I’m happy to report that Deepak is still an important part of the AWS team and that the Public Datasets program is still going strong!
Today we are announcing a new take on open and public data, the Registry of Open Data on AWS, or RODA. This registry includes existing Public Datasets and allows anyone to add their own datasets so that they can be accessed and analyzed on AWS.
Currently, there are 53 datasets in the registry. Each provides a tonne of data. Subjects vary from satellite imagery and weather monitoring to political and financial information.