Neural Networks have made great progress. They now recognize images and voice at levels comparable to humans. They are also able to understand natural language with a good accuracy. But, even then, the talk of automating human tasks with machines looks a bit far fetched. After all, we do much more than just recognizing image / voice or understanding what people around us are saying – don’t we?
Recently, Continuum Analytics, H2O.ai, and MapD announced the formation of the GPU Open Analytics Initiative (GOAI). GOAI—also joined by BlazingDB, Graphistry and the Gunrock project from the University of California, Davis—aims to create open frameworks that allow developers and data scientists to build applications using standard data formats and APIs on GPUs. Bringing standard analytics data formats to GPUs will allow data analytics to be even more efficient, and to take advantage of the high throughput of GPUs. NVIDIA believes this initiative is a key contributor to the continued growth of GPU computing in accelerated analytics.
Deep learning has fueled tremendous progress in the field of computer vision in recent years, with neural networks repeatedly pushing the frontier of visual recognition technology. While many of those technologies such as object, landmark, logo and text recognition are provided for internet-connected devices through the Cloud Vision API, we believe that the ever-increasing computational power of mobile devices can enable the delivery of these technologies into the hands of our users, anytime, anywhere, regardless of internet connection. However, visual recognition for on device and embedded applications poses many challenges — models must run quickly with high accuracy in a resource-constrained environment making use of limited computation, power and space. Today we are pleased to announce the release of MobileNets, a family of mobile-first computer vision models for TensorFlow, designed to effectively maximize accuracy while being mindful of the restricted resources for an on-device or embedded application. MobileNets are small, low-latency, low-power models parameterized to meet the resource constraints of a variety of use cases. They can be built upon for classification, detection, embeddings and segmentation similar to how other popular large scale models, such as Inception, are used.
Previously, in a walkthrough on building a simple application without a database, I touched on randomness. Randomness and generating random numbers is a surprisingly deep and important area of computer science, and also one that few outside of computer science know much about. As such, for my own benefit as much as yours, I thought I would take a deeper look at the surprising complexity of randomness.
It is a well-known fact now that “Data Scientist” is the sexiest job for the current century, but how much does it cost you to prepare yourself to be a data scientist? Are there other relevant jobs to data science and what is the career pathway for this job? What are the essential skills needed to enter this career? Where can we find free courses and what are the Must-Read Books in this field? I will try to answer most of these questions in this blog. I know those are common questions that are mind-blowing for all those thinking to take the data science career.
This set of exercises is about exploratory factor analysis. We shall use some basic features of psych package. For quick introduction to exploratory factor analysis and psych package, we recommend this short “how to” guide.
The LASSO has two important uses, the first is forecasting and the second is variable selection. We are going to talk about the second. The variable selection objective is to recover the correct set of variables that generate the data or at least the best approximation given the candidate variables. The LASSO has attracted a lot of attention lately because it allows us to estimate a linear regression with thousands of variables and the model select the right ones for us. However, what many people ignore is when the LASSO fails. Like any model, the LASSO also rely on assumptions in order to work. The first is sparsity, i.e. only a small number of variables may actually be relevant. If this assumption does not hold there is no hope to use the LASSO for variable selection. Another assumption is that the irrepresentable condition must hold, this condition may look very technical but it only says that the relevant variable may not be very correlated with the irrelevant variables.
I previously made some network graphs of Middle East country relationships here using Slate’s Middle East Friendship Chart. I was thinking of a way to visualize these relationships (and possibly other ones) with more rule based and objective measure over time. What kind of public dataset could show countries relationships accurately? I used weapons / arms trades between nations to explore these relationships. I think arms trades are a good indicator of how friendly countries are together because 1) countries won’t sell to enemies and 2) if a country wants to befriends country, buying weapons are a good way to do buy influence.
In MicrosoftML package comes – in my opinion – long anticipated function for installing R packages for SQL Server and Microsoft R Server. And, I am super happy. Last year, in one of my previous blog posts, I have been showing how to install R package from SSMS using sp_execute_external_script. Now, with new package MicrosoftML (that is part of Microsoft R Server 9.x and above) new function is available that enables you to easy install the package and also little bit more.