TDSP Modeling part 2

Microsoft in the News

Are you using Microsoft Edge?  You might want to give it a try.  The standard joke about Microsoft Edge is that it is only good for downloading Chrome or Firefox.  It is going to be very difficult for Microsoft to overcome this global bias, but they seem to be going about it in the right way.  They are offering proof that Edge is the better choice in areas that are important to consumers and businesses.

For consumers, Microsoft has shown, and separately, AVG has confirmed, that Edge outperforms Chrome in battery run time.  Using three identical laptops, they streamed 720p video using Edge, Chrome and Firefox.  The results:

Browser    Battery Time
Edge       16 hours, 8 minutes
Chrome     13 hours, 31 minutes
Firefox    9 hours, 52 minutes

For business, combating phishing scams is a very important task.  Microsoft has shown, and separately, cybersecurity firm NSS Labs has confirmed, that Edge blocks 18% more phishing sites than Chrome.  Although businesses are particularly keen to stop cyber criminals, any consumer with a bank account should also be paying attention.  If you want to continue believing the jokes and rumors on the internet instead of the research, Microsoft still has you covered.  You can download a browser extension for Chrome that will give you the same protection that Edge has.

Finally, of interest to both business and personal users, Microsoft has run an ad claiming that Edge is up to 48% faster than Chrome.  From what I’ve seen on YouTube, it seems to vary depending on what you are trying to load.

In one test done 8 months ago, Edge had a considerable advantage over Chrome with some websites, and a negligible disadvantage with others.

If you are used to Chrome, Edge will seem different.  But if battery life, security, and speed are important to you, you might want to take Edge for a test drive.  Make the switch for 1 week so you can get used to the different interface and I think you will be sold on Edge.

TDSP Lifecycle – Modeling part 2

In this blog we will continue the Modeling topic started in my previous blog.  Specifically, we will be looking at the Model Training and Model Evaluation sections.

            Model Training

Before you can train a model, you must pick the model that is best suited to the question you are trying to answer.  A good place for you to start this process is with Microsoft’s Machine Learning Algorithm Cheat Sheet.  You can download your cheat sheet here:

Choosing the right model could be the topic for a lengthy blog series, so I will not deal with that here.  For our purposes, we are mostly concerned with the process that a Project Manager would need in place in order to increase the likelihood of success.

Once the appropriate model is chosen, the process for training that model is as follows:

  1. Divide your input data into 2 randomized sets. One will be used for training, the other for testing the model.  You cannot reuse the training set for testing, since you need separate data to confirm that you didn’t simply optimize your model for the training data.
  2. Using the training data set, build your models.
  3. Test your models. Compare a variety of models, each using a variety of tuning parameters.  Build the models using the training data set and then test the model using the testing data set.
  4. Determine the best solution by comparing the success metrics. Be aware that a common problem, known as overfitting, occurs when you tune your model so closely to your data that it makes unrealistically precise predictions.  You want a model that works for the general case, not just for your training and test data sets.
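
To make the four steps above concrete, here is a minimal sketch in plain Python.  Everything in it is my own assumption for illustration: the synthetic data, the helper names (split_data, knn_predict, mean_abs_error), the simple nearest-neighbor model, and the choice of mean absolute error as the success metric.  It is not part of the TDSP tooling.

```python
import random

def split_data(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def knn_predict(train, x, k):
    """Predict y for x as the mean y of the k nearest training points."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def mean_abs_error(train, rows, k):
    """Average absolute prediction error over rows, using train as the model."""
    return sum(abs(knn_predict(train, x, k) - y) for x, y in rows) / len(rows)

# Hypothetical input data: y is roughly 2x, plus random noise.
rng = random.Random(42)
data = [(x, 2 * x + rng.gauss(0, 5)) for x in range(100)]

# Step 1: two randomized, non-overlapping sets.
train, test = split_data(data)

# Steps 2 and 3: build one model per tuning parameter, then test each one.
results = {}
for k in (1, 5, 15):
    results[k] = {
        "train_mae": mean_abs_error(train, train, k),
        "test_mae": mean_abs_error(train, test, k),
    }
    print(k, results[k])

# Step 4: pick the parameter with the best metric on the testing set.
best_k = min(results, key=lambda k: results[k]["test_mae"])
print("best k:", best_k)
```

With k = 1 the model simply memorizes the training set, so its training error is zero while its test error is not.  That gap between training and test performance is the warning sign described in step 4.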

The TDSP environment provides you with an automated modeling and reporting tool.  This tool will track your work through multiple algorithms and parameters and produce a baseline model.  It will also build a baseline modeling report summarizing the performance of each model/parameter combination, including variable importance.  Using the reporting tool iteratively, you will gain insights that may allow you to better engineer the features, sending you back to start the whole process again.
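
I can't reproduce the TDSP tool here, but the idea of a baseline report can be sketched in a few lines of plain Python.  The sketch below is an assumption-laden illustration, not the tool's actual output: the data is synthetic, the model is a simple nearest-neighbor average, and variable importance is estimated with permutation importance (how much the test error grows when one feature's values are shuffled).  All function and field names are hypothetical.

```python
import random

def knn_predict(train, features, k):
    """Mean target of the k training rows closest to `features`."""
    def dist(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], features))
    nearest = sorted(train, key=dist)[:k]
    return sum(y for _, y in nearest) / len(nearest)

def mae(train, rows, k):
    """Mean absolute error of the k-nearest-neighbor model on rows."""
    return sum(abs(knn_predict(train, f, k) - y) for f, y in rows) / len(rows)

def permutation_importance(train, test, k, column, seed=0):
    """Increase in test error after shuffling one feature column."""
    values = [f[column] for f, _ in test]
    random.Random(seed).shuffle(values)
    shuffled = []
    for (f, y), v in zip(test, values):
        g = list(f)
        g[column] = v
        shuffled.append((tuple(g), y))
    return mae(train, shuffled, k) - mae(train, test, k)

# Hypothetical data: feature 0 drives the target, feature 1 is pure noise.
rng = random.Random(1)
rows = []
for _ in range(200):
    signal = rng.uniform(0, 10)
    noise = rng.uniform(0, 10)
    rows.append(((signal, noise), 3 * signal + rng.gauss(0, 1)))

# The rows were generated in random order, so slicing gives a random split.
train, test = rows[:140], rows[140:]

# One report row per model/parameter combination.
report = []
for k in (1, 5, 15):
    report.append({
        "k": k,
        "test_mae": mae(train, test, k),
        "importance": [permutation_importance(train, test, k, c) for c in range(2)],
    })

report.sort(key=lambda r: r["test_mae"])  # best combination first
for r in report:
    print(r)
```

A feature whose importance stays near zero for every combination, like the noise column here, is a candidate for removal or re-engineering on the next pass through the process.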


These are the deliverables for the Modeling stage:

  • Feature Sets: These will be described in the Feature Sets section of the Data Definition report.  The description will include a pointer to the code used to generate the features as well as an explanation of how they were generated.
  • Model Report: A standardized report will be generated that will provide details for each model that was tested.
  • Checkpoint Decision: Here you need to decide whether your model is ready for deployment.  The questions that will help determine this include:
    • Does the model answer the question?
    • Do you have enough confidence in the answer?
      • If not,
        • do you need more data, or
        • should you try a new approach, or
        • can you do more feature engineering, or
        • would a different algorithm be worth trying?

In my next blog, we will look at Deployment.

