Since coming back to the UK, I have started studying Stanford University’s Machine Learning course on Coursera, presented by Andrew Ng. I’m currently three weeks into the eleven-week course. It is well presented, with some good materials, and the coursework is challenging me intellectually.
Apart from the course being good in itself, I have a number of other reasons for doing this:
- The press on Artificial Intelligence and Machine Learning can be quite breathless and lacking in critical thinking, and I want to be able to form an informed opinion about the wider societal implications of the technologies.
- I haven’t been able to find much discussion of the use of Machine Learning on open datasets. My instinct tells me that, as we develop more open data standards and publish data to them at scale, this will change.
- Where I have seen Machine Learning techniques applied to open data, it has made me quite worried. Predictive policing, for example, can perpetuate historical imbalances by sending police to areas that have been overpoliced in the past.
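To make the predictive policing worry a bit more concrete, here is a toy sketch in Python. It is not how any real system works. Everything in it is an assumption made up for illustration: the function name `simulate_patrols`, the two imaginary areas, the starting figures, and the rule that patrols are shared out in proportion to past recorded incidents. Both areas are given the same underlying incident rate, so any difference in the records comes purely from where police were sent before.

```python
def simulate_patrols(recorded, true_rate, total_patrols, rounds):
    """Toy model (hypothetical): share out patrols in proportion to past
    recorded incidents, then record new incidents where the patrols go.

    recorded      -- starting recorded-incident counts per area (made-up numbers)
    true_rate     -- incidents recorded per patrol in each area (assumed)
    total_patrols -- patrols available each round
    rounds        -- number of rounds to simulate
    Returns the patrol share given to each area in every round.
    """
    recorded = list(recorded)
    history = []
    for _ in range(rounds):
        total = sum(recorded)
        # Allocation is driven only by the historical record.
        shares = [r / total for r in recorded]
        patrols = [total_patrols * s for s in shares]
        # What gets recorded depends on how many patrols were present.
        new_incidents = [p * rate for p, rate in zip(patrols, true_rate)]
        recorded = [r + n for r, n in zip(recorded, new_incidents)]
        history.append(shares)
    return history


# Two areas with identical underlying rates, but area 0 starts with
# more recorded incidents because it was patrolled more in the past.
history = simulate_patrols(
    recorded=[60, 40],
    true_rate=[0.5, 0.5],
    total_patrols=100,
    rounds=10,
)
first_round, last_round = history[0], history[-1]
```

In this sketch the patrol split stays at roughly 60/40 in every round, even though the two areas are identical underneath. The data never gets a chance to correct the original imbalance, because the records only reflect where the patrols already were.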
I’m interested in talking about this at Open Data Camp in Belfast this weekend. Some of the questions I’d like to ask in the session are:
- What do people know about Machine Learning and what examples do they have of it being applied to open data?
- Are we going to see Machine Learning applied to more open datasets in the future? If so, which ones?
- Will developing more open data standards and publishing data to them at scale increase the use of Machine Learning on open data?
- What are the ethical considerations for open data professionals using Machine Learning?
I’m aware that I am making some rather sweeping assumptions based on very little knowledge, so I’m eager to hear from people who know what they are talking about, both here and at Open Data Camp.