In my previous blog post (Are there any language detection tools for assigning language to music data?), I described my failed attempts at concatenating Artist Origin (or, to be more precise, the artist's origin with respect to the language sung in) to a dataset created from Spotify's Web developer API. This information used to be available... Continue Reading →
Are there any language detection tools for assigning language to music data?
Music is a matter of taste and some of us have... how should I put it? Different ideas of what is good music and what is trash that should never have seen the light of day. I have been, for a few years now, a huge fan of Chinese Hip Hop and Rap (哈狗帮, 龙井说唱 and 龍胆紫)... Continue Reading →
Working with large CSV files in pandas? Create a SQL database by reading files in chunks
It is not uncommon to have to deal with, for instance, CSV files containing millions of rows. Searching, filtering and slicing can therefore be time-consuming tasks. So the question is: are there any ways to speed up the process? If so, this could save a considerable amount of time for any data scientist needing to... Continue Reading →
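The idea in the title, reading a large file in chunks and writing each chunk to a SQL database, can be sketched roughly as follows. This is a minimal illustration and not the code from the post: it uses only Python's standard library (`csv` and `sqlite3`) rather than pandas, and the function name `csv_to_sqlite`, the table name and the default chunk size are hypothetical choices. It also assumes that the first row of the file is a header and that every row has the same number of fields.

```python
import csv
import sqlite3
from itertools import islice


def csv_to_sqlite(csv_path, db_path, table, chunk_size=50_000):
    """Copy a CSV file into a SQLite table chunk by chunk.

    Returns the number of data rows written. Hypothetical helper:
    columns are created without types, so values are stored as text.
    """
    total = 0
    conn = sqlite3.connect(db_path)
    try:
        with open(csv_path, newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = next(reader)  # assumption: first row holds column names
            columns = ", ".join(f'"{name}"' for name in header)
            placeholders = ", ".join("?" for _ in header)
            conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
            insert_sql = f'INSERT INTO "{table}" VALUES ({placeholders})'
            while True:
                # Only chunk_size rows are held in memory at any time.
                chunk = list(islice(reader, chunk_size))
                if not chunk:
                    break
                conn.executemany(insert_sql, chunk)
                conn.commit()
                total += len(chunk)
    finally:
        conn.close()
    return total
```

Once the data sits in the database, filtering can be pushed down to SQL (optionally with an index on the columns you search most) instead of loading every row into memory. In pandas, the same pattern is usually written with `pandas.read_csv(..., chunksize=...)` and `DataFrame.to_sql(..., if_exists="append")`.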
The Need for Intelligible Artificial Intelligence
Introduction A few years ago, I gave a talk at a healthcare conference organized by Computer Sweden on the importance of AI for the future of healthcare. If I remember correctly, I described a Breast Cancer Detection model I had constructed with the help of annotated data. Some people in the crowd were impressed while... Continue Reading →
Virus Spread Simulation Revisited – Population Attributes and Healthcare Resources
Introduction In my latest blog post (Simulating a Virus Spread – What you can do to help healthcare cope), I described the importance of social distancing in a pandemic in order to minimize the load on healthcare services, a concept now widely known as "flattening the curve". I choose to revisit this... Continue Reading →
Simulating a Virus Spread – What you can do to help healthcare cope
Introduction - or why the net is flooded with the same types of descriptive statistics I've been putting off this moment for quite a while now. Yes, you know that moment when you feel that everybody (and I mean EVERYBODY) has written something about Covid-19 and you ask yourself if you really should partake in the... Continue Reading →
An easy way to deal with Missing Data – Imputation by Regression
Introduction I was recently asked to give a talk for junior data scientists about analytics and machine learning. As much as I like to talk publicly, I was scratching my head about what I could offer these young minds, at least in terms of novel knowledge. Let's face it: these people are fresh out of school and... Continue Reading →
Gambling with Reinforcement Learning or playing it safe with Supervised and Unsupervised Learning?
I haven't written a post in a very long time and I have gotten quite a few messages asking when the next post will be published. Too much work, a lovely summer and a lot of thinking about deep learning have kept me from writing and it is the latter I wish to talk... Continue Reading →
AI and the value of work
Introduction For those who follow Sopra Steria Analytics Sweden and expect the customary end-to-end manual to some model or technique, this post will be a major disappointment as it is entirely devoted to a discussion on the impact of AI on labor and on what could be called "the value of work". It is my... Continue Reading →
Understanding object detection using YOLO and training for new objects – Part 1
The field of computer vision for the purpose of object recognition is developing at a fast pace. Apart from the obvious examples of self-driving vehicles, there is a wide range of possible applications, such as the field of predictive maintenance of, for instance, power grids. The identification of power lines at risk (e.g. trees growing... Continue Reading →