Milestone Systems Vice President APAC Benjamin Low shares his insights on shallow and deep learning, arguing that machines can be better at low-cognitive tasks than humans – and often deliver a better quality of service.
By Benjamin Low
Artificial Intelligence is an all-encompassing category covering many things, including a wide range of neural networks with different capabilities. Neural networks are distinguished by how they approach a particular unstructured data set or problem – whether with a fixed process, an algorithm or a machine learning approach.
Due to earlier limitations in hardware processing power, machine learning could only apply shallow learning to very large data sets. Shallow learning looks at data in just three dimensions. With recent, significant advances in the processing power of graphical processing units (GPUs), we can now take a deep learning approach and look at data in many more levels or dimensions – hence the word “deep”.
This new GPU-compute platform can be adopted by re-coding software to use a technique called parallelisation. Software parallelisation breaks a single problem into hundreds or thousands of smaller problems. The software can then run those 100 or 1,000 processes across 1,000 processing cores, instead of waiting for one core to process the data 1,000 times in sequence.
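The idea of splitting one problem across many cores can be sketched in a few lines of Python. This is a minimal CPU-based illustration, not GPU code: the chunking scheme, function names and worker count are all illustrative assumptions, but the pattern – divide the data, solve the pieces concurrently, combine the results – is the same one parallelised GPU software uses at much larger scale.

```python
from concurrent.futures import ProcessPoolExecutor

def partial_sum(chunk):
    """Solve one small sub-problem: sum the squares of a slice of the data."""
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, workers=4):
    """Break one large problem into `workers` smaller ones and
    solve them on separate cores, instead of one core doing it all."""
    size = max(1, len(data) // workers)
    chunks = [data[i * size:(i + 1) * size] for i in range(workers - 1)]
    chunks.append(data[(workers - 1) * size:])  # remainder goes to the last chunk
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))

if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(1000))))
```

The result is identical to the serial computation; only the time to reach it changes as more cores join in.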
With parallelisation, there is a quantum leap forward in how fast we can solve a problem. And the faster we can solve a problem, the deeper we can go into it – and the deeper the data sets that can be processed.
IoT Frameworks and Aggregation of Data
The focus for video management platforms will be to continue connecting more and more IoT devices, such as IoT-enabled cameras, across different frameworks into a common data centre. Video management platform providers will continue to exploit GPU technology to create a whole new level of processing, helping companies that use GPU parallelisation – like BriefCam – to run those functions on the unused GPU capacity already in the hardware.
NVIDIA, which invented the GPU, is driving machine-to-machine communication at an exponential rate. One of NVIDIA’s GPUs offers 5,000 cores – meaning thousands of smaller problems can be worked on in parallel – and we are hardly using any of those cores yet! We are decoding the video and detecting slight motion, but there is still a significant amount of processing resource available.
By allowing companies to plug right into that pipeline, the video management software (VMS) can process all of their data without any additional hardware – the processing power is already there. We can then extract all the metadata, aggregate it and start creating automation. From there, we can create new types of visual presentation for this information.
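The aggregate-then-automate step can be pictured with a toy sketch. The event format below is entirely hypothetical – real VMS metadata is far richer – but it shows the shape of the idea: individual detections roll up into counts, and a threshold on those counts triggers an automated response instead of a human watching every frame.

```python
from collections import Counter

# Hypothetical per-detection metadata emitted by video analytics:
# (camera_id, object_type) for each object an analytic flags.
events = [
    ("cam-01", "person"), ("cam-01", "person"), ("cam-02", "vehicle"),
    ("cam-01", "person"), ("cam-03", "person"), ("cam-01", "person"),
]

def aggregate(events):
    """Roll individual detections up into counts per (camera, object type)."""
    return Counter(events)

def automate(aggregated, threshold=3):
    """Fire a simple automated response wherever a count crosses the threshold."""
    return [key for key, count in aggregated.items() if count >= threshold]

summary = aggregate(events)
print(automate(summary))  # → [('cam-01', 'person')]
```

Here cam-01 has seen four people, so it alone trips the (arbitrary) threshold of three – the machine surfaces the anomaly, and the human decides what to do about it.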
Advanced rendering is about creating a whole different type of mixed reality. Some data are artificial, some data are real. As humans, we will use both to create a more interesting and useful picture of a problem. The BriefCam Synopsis system is an example of mixed reality. It uses real video, extracts objects of interest and then provides an overlay of augmented reality. Humans cannot look at 24 hours of video in nine minutes. But with Synopsis our intelligence can be augmented.
Actualized Potential of Augmentation
AI and machine learning are being applied so that AI-enabled devices and machines become very good at low-cognitive functions. For example, humans cannot watch all cameras simultaneously, all the time; our attention does not work that way. But machines are extremely good, and extremely detailed, at exactly this. We do not see pixels; we see objects. The machine sees the finest detail available to it – the pixel – and within the pixel it can see still more detail: the shades of colour in that image.
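That pixel-level attentiveness is easy to sketch. The snippet below is a deliberately simplified illustration, assuming frames arrive as 8-bit grayscale grids (real video analytics work on colour streams with far more sophisticated models): the machine compares every pixel of every frame against the last one, something no human could sustain.

```python
def changed_pixels(prev_frame, curr_frame, threshold=25):
    """Compare two grayscale frames pixel by pixel and return the
    (x, y) coordinates where brightness changed by more than `threshold`.
    A machine 'watches' by repeating this tirelessly for every frame."""
    changes = []
    for y, (prev_row, curr_row) in enumerate(zip(prev_frame, curr_frame)):
        for x, (p, c) in enumerate(zip(prev_row, curr_row)):
            if abs(c - p) > threshold:
                changes.append((x, y))
    return changes

# Two tiny 3x3 "frames": one pixel brightens from 10 to 200.
frame_a = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
frame_b = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]
print(changed_pixels(frame_a, frame_b))  # → [(1, 1)]
```

The machine never tires of this comparison and never skips a pixel – which is precisely why low-cognitive watching is better delegated to it.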
By aggregating data and allowing machines to automate responses and solutions, we can augment human interaction and our environment.
Everything is about to change. In how people review and utilise video and data alone, we are going to see massive advancements. Imagine an interaction between a near-eye lens, a medium-distance viewing glass and large video screens. There may be an overlay of detailed text data on the small lens, augmented video in the medium distance, and the big scene view on a large screen. The live video, augmented visuals and text data will work in concert. When looking at a large scene, data will change what you are seeing on the near-eye screen. With this intelligence augmentation, the system will know that you are looking at a face, a building or a license plate; it will help to figure out who or what you are looking at and show related information. That is all possible today.
A Visionary Model
AI is fast becoming an important asset for ensuring security in smart cities, such as Singapore. AI technology is already being integrated within Singapore’s Home Team border security and homeland security applications. Coupled with video capabilities, this means that law enforcement agents are able to make better decisions through AI-driven perception, processing, and analysis.
Examples of video and AI integration can be seen in law enforcement applications on the other side of the globe – the City of Hartford in the U.S. is a great example of technology used as a force multiplier. Milestone and its partners have worked with the City of Hartford’s C4 Crime Centre to create an intelligent city beyond human capability. The Crime Centre uses BriefCam Synopsis technology with the Milestone VMS platform and other analytics such as ShotSpotter and Hawkeye Effect GPS location. It also uses Axis IP cameras and all the other devices aggregated across Hartford to solve crimes it could not solve before.
Not only are many more crimes now solvable, but the Crime Centre no longer has to spend 30 hours on low-cognitive, manual tasks, like freezing on a rooftop to watch a drug house all day and night. Officers can now sit at their desks and, within just a few minutes, know exactly where a drug house is by seeing an augmented-reality view of foot traffic over time, accordioned into a useful overview.
Officers can simply go into the data and extract the problem. That precision and efficient use of resources is a game-changer in how we as humans will work in our normal jobs. This is just one of many examples that we are seeing as we identify and address the problems we need to solve using new technologies.
An Intelligence Revolution
Having machines take over low-cognitive tasks will be the big trend for years to come. With proper aggregation of information, machines can be better at low-cognitive tasks than humans are, and often deliver a better quality of service than humans.
Amazon is applying this to retail stores where the concept of a checkout is being replaced by customers simply walking out. By using data from smartphones, cameras, sensors, purchase histories and other data points, Amazon is making it possible for us to walk into a store, pick up what we need and walk out. Everything else is taken care of by machines.
This type of thinking and tool creation is in its earliest infancy but will continue to address problems that are of more value to our lives.
In the book ‘The Inevitable’, Kevin Kelly suggests that the next 10,000 startup businesses will be based on bringing Artificial Intelligence to something – much as, during the industrial revolution, everything was electrified. We’ve seen washing machines, for example, go from manual operation to electric, and now to having a network port and some level of AI.
The intelligence revolution is happening all around us. It will be very disruptive within the security and surveillance industry — but also insightful and liberating as we free human efforts for higher cognitive processes that address the larger challenges.