This will be the last blog post dealing with data science from a general perspective. Today we will look at steps 6 and 7 in the DS diagram.
- Answer the Question
Now for the fun part: machine learning! Or not. I say “or not” because once your data is organized, you may be able to simply graph it, and once you can see the whole picture, the answer may become obvious. Take the earlier example of figuring out how long a warranty to offer on your product: once you see that the life of the product correlates with temperature, you may decide to offer one warranty in the continental US and a different warranty in Alaska. There is not much need for machine learning here.
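To make the warranty example a little more concrete, here is a minimal sketch of the kind of quick check I have in mind before reaching for machine learning. The numbers are made up purely for illustration (they are not real product data), and the names `avg_temp_c`, `lifetime_months`, and `pearson` are my own placeholders.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: average local temperature vs. observed product life.
# Both the values and the direction of the trend are assumptions.
avg_temp_c = [-5, 0, 10, 15, 20, 25]
lifetime_months = [30, 34, 41, 46, 50, 55]

r = pearson(avg_temp_c, lifetime_months)
print(f"correlation between temperature and lifetime: {r:.2f}")
```

If the coefficient comes out close to 1 or -1, a simple scatter plot will probably tell you everything you need, and splitting the warranty by region becomes an easy call.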
On the other hand, if the question is more complicated and machine learning is going to be helpful in seeing through the fog, then it is time to start searching Azure’s machine learning catalogue for the right tool(s) for the job. If your data set is small, you may find yourself using optimization to find an answer. However, machine learning works best when there is a plethora of data. Having lots of data leads to answers you can feel more confident in.
If you have assumptions that you feel confident making, then you may be able to get away with a small data set and still feel confident in the result. For example, if you are trying to calculate the time it takes for an object to cool, you can take a large set of data points, calculate the rate of cooling, and then apply that rate to a specific case. Or, you can apply some known math to the problem (the assumption) and use it to answer the question for an individual set of data points.
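Here is a rough sketch of the "known math" route for the cooling example. I am assuming Newton's law of cooling, T(t) = T_ambient + (T_start - T_ambient) * exp(-k * t), as the known math; the post does not name a specific formula, so treat that choice as my assumption. The readings are synthetic (generated from the formula itself with k = 0.1), and the function names are hypothetical.

```python
import math

def fit_cooling_constant(times, temps, ambient):
    """Estimate k by a least-squares line through ln(T - ambient) vs. time.

    Under Newton's law of cooling the slope of that line is -k.
    """
    ys = [math.log(temp - ambient) for temp in temps]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    slope = (
        sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
        / sum((t - mean_t) ** 2 for t in times)
    )
    return -slope

def time_to_reach(start, target, ambient, k):
    """Time for an object at `start` to cool to `target` (same units as 1/k)."""
    return math.log((start - ambient) / (target - ambient)) / k

# Synthetic readings (an assumption, not measured data):
# ambient 20 degrees, starting at 90 degrees, true k = 0.1 per minute.
ambient = 20.0
times = [0, 5, 10, 15]
temps = [ambient + 70.0 * math.exp(-0.1 * t) for t in times]

k = fit_cooling_constant(times, temps, ambient)
cooling_minutes = time_to_reach(90.0, 55.0, ambient, k)
print(f"k = {k:.3f} per minute, time to reach 55 degrees = {cooling_minutes:.1f} minutes")
```

The point is that four readings are enough here only because the formula carries most of the weight. Without that assumption, you would need far more data to learn the shape of the curve.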
If none of these actions gives you confidence that you are getting the right answer, that is a strong indication that you need to go back to square one and gather more data.
- Use the Answer
This is where the rubber finally meets the road. You now have an answer to your question, and that answer needs to be put to good use. Publish, blog, tweet, share, present, or whatever else you can do to get some action taken based on your findings. Sitting on your results, or not fighting to get them recognized, is a recipe for languishing in a career. Turning them into a presentation for fellow geeks is an excellent way to put a little “BAM” into your career.
Now, if you are a true data scientist, there is still one more step: Get more data!
And that is all for the sidebar. In my next blog post we return to the TDSP framework provided by Microsoft.