[INTELLIGENT MACHINES]

SeeThings, a Shazam for things.

Description

" what if i told you there is an app in the market"<br> -jian yang

Silicon Valley is one of the coolest TV shows out right now, in the sense that its portrayal of software developers is all too accurate, if a bit exaggerated.


One of the funniest episodes is the one where the team builds an app that uses machine learning to classify food, but our character Jian Yang ends up building a Not Hotdog app that only classifies food as hotdog or not hotdog. Jian Yang has always been an odd character, and kudos to Jimmy O. Yang for portraying him, but this is what inspired me to build my own app that classifies things.


Having worked with Android Studio to build a simple attendance and notes manager, I had always wanted to build an app with React Native, which has recently become the go-to way to build scalable apps fast, so what better first app than one that predicts things? I got the inspiration for this by reading this article from the dev who made the app for the show.



SeeThings v1


Since I wanted a general-purpose classification app, what better network to use than one already pretrained on 1000 categories? I had a vision for it in mind: it should closely resemble the Not Hotdog app while still having its own originality. I used a pretrained MobileNet from Keras and wanted to run it with TensorFlow.js, but TensorFlow.js wasn't supported on React Native. Then I moved on to react-native-tensorflow, a wrapper around the TensorFlow Lite module, but it hadn't been updated for the recent React Native version, and upon receiving no updates from the community, I decided to move on to a server-based solution to the problem.
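To make the idea concrete, here is a minimal sketch of classifying an image with a pretrained Keras MobileNet, roughly the kind of inference the server runs; the imports are standard Keras/PIL, and the `test.jpg` path is just a placeholder of mine:

```python
# A minimal sketch of running a pretrained MobileNet on an image with Keras.
# "test.jpg" is just a placeholder path for whatever picture the camera took.
import numpy as np
from keras.applications.mobilenet import MobileNet, preprocess_input, decode_predictions
from PIL import Image

model = MobileNet(weights="imagenet")  # already trained on the 1000 ImageNet categories

# MobileNet expects a 224x224 RGB input, scaled the way preprocess_input does it
img = Image.open("test.jpg").convert("RGB").resize((224, 224))
x = preprocess_input(np.expand_dims(np.asarray(img, dtype="float32"), axis=0))

# decode_predictions turns the 1000-way softmax output back into readable labels
for _, label, prob in decode_predictions(model.predict(x), top=3)[0]:
    print(label, round(float(prob), 3))
```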


As of now, v1 of the app hosts the model server-side on Heroku to predict objects, which at times turns into a fun game of guessing what the model saw in the image to pick that class.
Edit: I recently had a talk with Tim Anglade, the developer of the app, and he suggested that I write a JavaScript wrapper over the TensorFlow Lite Java implementation and run my model offline instead of on a server. So v1 will see quite a few changes in both its overall working and its UI.
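For the curious, here is a rough sketch of what such a Flask endpoint can look like, assuming the client posts the photo as a base64 string; the `/predict` route and the JSON field names are placeholders of mine, not necessarily what SeeThings uses:

```python
# Sketch of a Flask endpoint that decodes a base64 image and returns MobileNet predictions.
# The "/predict" route and the "image"/"predictions" JSON keys are illustrative only.
import base64
import io

import numpy as np
from flask import Flask, jsonify, request
from keras.applications.mobilenet import MobileNet, preprocess_input, decode_predictions
from PIL import Image

app = Flask(__name__)
model = MobileNet(weights="imagenet")  # load once at startup, reuse for every request

@app.route("/predict", methods=["POST"])
def predict():
    # The client sends the camera capture as a base64 string in the JSON body
    img_bytes = base64.b64decode(request.get_json()["image"])
    img = Image.open(io.BytesIO(img_bytes)).convert("RGB").resize((224, 224))
    x = preprocess_input(np.expand_dims(np.asarray(img, dtype="float32"), axis=0))

    top3 = decode_predictions(model.predict(x), top=3)[0]
    return jsonify(predictions=[{"label": label, "probability": float(prob)}
                                for _, label, prob in top3])

if __name__ == "__main__":
    app.run()
```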

A very quick overview of how I made the app:


  • I first decided how the app was supposed to work; there were 4 screens I had in mind: the start screen, the camera screen, the result screen and the past results screen.
  • Used a pretrained MobileNet model from Keras and hosted it on Heroku using Flask (will move to a better server or offline support in v2).
  • Initialised components for each screen and coded the behaviour every part is supposed to have.
  • Used the react-native-camera module to create a camera screen with options like flash, camera swapping and auto zoom.
  • Passed the image as base64 from the client side and displayed it along with the results on the result screen.
  • Used the past results screen to store previous results and share them (to be applied in v2).
  • Mapping emojis to all the categories and displaying them along with the object class (to be applied in v2; see the sketch after this list).
  • Use different models and find a good tradeoff between size and accuracy.
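As a teaser for the emoji idea, here is a hypothetical sketch of how the mapping could look; the particular labels, emojis and helper name below are my own illustration, not the planned v2 code:

```python
# Hypothetical sketch: decorate predicted ImageNet labels with an emoji for the result screen.
# The table is intentionally tiny; unknown labels fall back to a generic question mark.
LABEL_TO_EMOJI = {
    "hotdog": "🌭",
    "banana": "🍌",
    "pizza": "🍕",
    "laptop": "💻",
    "sports_car": "🏎️",
}

def label_with_emoji(label):
    """Return the predicted class name prefixed with a matching emoji."""
    return "{} {}".format(LABEL_TO_EMOJI.get(label, "❓"), label)

print(label_with_emoji("banana"))    # 🍌 banana
print(label_with_emoji("espresso"))  # ❓ espresso (no mapping yet)
```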