Information versus knowledge: when an image is worth more than a thousand words


Post originally published on the RecerCaixa Blog.

Josep Lladós, Computer Vision Center Director

Last year, at a debate organised by La Caixa, we talked about the challenges of the digital era: knowledge versus information. We discussed the demands and opportunities, as well as the risks, that we face in the information society revolution. The internet is reshaping information consumption habits and allowing us to define new paradigms in which citizens can access data on the web in a ubiquitous, universal and immediate manner. At the same time, the sheer volume of information generated, the difficulty of processing it, and the question of each source's reputation make the step from information to knowledge a veritable challenge and necessity.

Facts speak for themselves: every minute, 278,000 tweets are generated, 3,600 pictures are posted on Instagram, 72 hours of video are uploaded to YouTube, 204 million emails are sent and 2 million Google searches are performed. If we want all this information to be of any use, we need to turn it into knowledge. In other words, recipients have to receive it processed and correlated. Artificial Intelligence techniques able to extract the semantics of information are increasingly necessary. We are immersed in an era of big data, where analytical processes are crucial to transform information into knowledge.

The interpretation of the information contained in images is a most useful example. Computer vision can be defined as the discipline that develops computer programs to give machines the ability to see. To see is to interpret visual information: to turn pixels, the elemental unit of information, into knowledge. In recent years, computer vision has become an emergent and ubiquitous technology. On a daily basis we use camera-equipped devices running vision programs: night cameras that watch over newborns, cameras in videogame consoles that detect our gestures and move the game avatars, cameras that read licence plates at parking entrances, cameras that detect whether a ball is in or out at a tennis match or whether a football player is offside, and so on. Vision is a technology increasingly in demand in sectors such as transport, health and security. It has been estimated that the vision market will grow by 40% annually until 2020.

When digital images come from scanned or photographed documents, we speak of the subarea called document image analysis and recognition, which focuses on the automatic recognition of a document's content (whether printed, typewritten or handwritten text, as well as graphic elements). Historical archives and libraries hold millions of documents, many of them manuscripts, which contain the historical memory of societies.

These documents have long been inaccessible to the general public. For some years now there have been large digitisation campaigns which allow, at least, those documents to be published online. Nevertheless, placing the images in the public domain without some degree of structure and indexing is highly inefficient. It is necessary to transcribe those documents and give them structure; only then will the interested public be able to consume the knowledge they store.

In the EINES project, funded by RecerCaixa, we focus on documents containing demographic information, particularly population censuses. The project has brought together a multidisciplinary research team from the Computer Vision Center (computer engineers) and the Center for Demographic Studies (social sciences), both at the Autonomous University of Barcelona (UAB). Its aim is the extraction of information from digitised images of historical censuses (more than a hundred years old) and their subsequent analysis. The extraction has to go beyond literal transcription, identifying named entities (names, places, dates, professions, etc.). The resulting dataset, properly structured and indexed, is our door to assimilating the knowledge of the past. With this information, openly shared, professionals and citizens can track community evolution, genealogies, individual trajectories, and more. At this point we can state that one image contains more than a thousand words. Interpreting the image can, in this context, mean interpreting the past.
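To make the idea of going "beyond literal transcription" concrete, here is a minimal sketch of turning a transcribed census line into labelled entities. The semicolon-separated record format, the field order and the example person are invented for illustration; the real EINES records and entity labels differ:

```python
# Hypothetical example: turning a literal census transcription into
# structured named entities (name, date, place, profession).
# The record format and field order here are invented for illustration.

def parse_census_line(line):
    """Split one transcribed census record into labelled entities."""
    fields = [f.strip() for f in line.split(";")]
    name, birth_year, birthplace, profession = fields
    return {
        "name": name,
        "birth_year": int(birth_year),
        "birthplace": birthplace,
        "profession": profession,
    }

record = parse_census_line("Maria Soler; 1851; Sant Feliu de Llobregat; weaver")
print(record["profession"])  # weaver
```

Once records are structured like this, they can be indexed and cross-linked, which is what enables tracking genealogies and individual trajectories across census years.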

In the EINES project, the extraction of the images' content proceeds by two means. First, computer vision technologies allow the documents to be ‘read’ automatically. It must be said that the technology is not yet mature enough to guarantee fully automatic transcription; plenty of research remains to be done. This is where citizen intervention becomes valuable. Digital networks offer the possibility of ‘democratising’ the generation of knowledge through crowdsourcing platforms, and in this way volunteer citizens are participating in the project's extraction process. This should not be seen as altruistic work, but as a trigger for social innovation. In new innovation models, there are ecosystems in which challenges are undertaken with the active inclusion of citizens. When it comes to recovering memory from historical documentary sources, citizens, as living archives, contribute complementary knowledge of great value.

In conclusion, new technologies are instruments that address the challenges generated by the exponential growth of information on the network and its transformation into knowledge. In the world of images, the interpretation of content is fundamental, and computer vision emerges as an enabling technology. Furthermore, we must not forget, but rather promote, the involvement of users in this process. The new innovation models around so-called citizen science empower citizens and make them the subjects of knowledge generation. Digital humanities, and the interpretation of large volumes of archived document images through technology-assisted transcription, are a great example of this.

EINES Project in the media:



La Vanguardia, 25/05/2016, ‘Sant Feliu de Llobregat recupera el legado de sus antepasados’

El Mundo, 26/05/2016, ‘Sant Feliu desentraña su primera “red social”, que se remonta a 1828’

Cadena SER, 04/06/2016 (radio clip).





Related articles: Xarxes: Connecting the lives of our ancestors


The future of autonomous cars: understanding the city with the use of videogames


Researchers at the Computer Vision Center in Barcelona have created a new virtual world aimed at teaching autonomous vehicles to see and understand a city.

Currently, autonomous vehicles, such as the Google Car or Tesla models, need to develop a “core intelligence” that allows them to visually identify and recognize different elements, such as roads, sidewalks, buildings and pedestrians. In short: to see and understand a road as humans do. The project is led by researcher Germán Ros along with Dr. Antonio M. López, both from the Computer Vision Center in Barcelona.

As Germán Ros puts it: “These vehicles need Artificial Intelligence (AI) to understand what is happening around them. This is achieved by building artificial systems that simulate the structure and functioning of human neuronal connections. Our new simulator, SYNTHIA, is a huge step forward in this process”.

SYNTHIA (a system of synthetic images) is able to accelerate and improve the way in which artificial intelligences learn to understand the city and its elements. This is a significant advance on one of the major challenges in this scientific area. The data generated by the simulator will be openly released to the scientific community in Las Vegas at the International Conference on Computer Vision and Pattern Recognition. With this, the researchers want to spur scientific progress in areas such as artificial intelligence and autonomous driving.
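The key property of a simulator like this is that, because the scene is rendered rather than photographed, the per-pixel class labels come for free, with no human annotation. A toy sketch of the idea, with invented classes and layout (not SYNTHIA's actual rendering pipeline):

```python
import numpy as np

# Toy illustration of synthetic ground truth: when we render a scene
# ourselves, every pixel's class label is known by construction.
# Classes and scene layout here are invented for illustration.
ROAD, SIDEWALK, BUILDING = 0, 1, 2

def render_scene(height=120, width=160):
    """Return a fake 'image' and its pixel-perfect label map."""
    labels = np.full((height, width), BUILDING, dtype=np.uint8)
    labels[height // 2:, :] = ROAD                 # bottom half: road
    labels[height // 2:, :width // 8] = SIDEWALK   # left strip: sidewalk
    image = np.stack([labels * 80] * 3, axis=-1)   # stand-in for rendered RGB
    return image, labels

image, labels = render_scene()
print(image.shape, np.unique(labels))  # (120, 160, 3) [0 1 2]
```

A real simulator renders photorealistic streets instead of flat colour blocks, but the payoff is the same: the (image, labels) pair is produced automatically, which is exactly the annotation work that would otherwise require hours of human supervision.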

Until now, the main limitation in the development of artificial intelligence has been the large volume of data and human work required for AIs to learn complex visual concepts under diverse conditions (for example, the difference between the road and the sidewalk on a rainy day): a tedious and expensive process requiring many hours of human supervision.

SYNTHIA is therefore a revolution. It uses a virtual simulator to generate annotated training data in a simple and automatic way (with no human intervention). Thanks to this advance, the typical limitations of human work (time and errors) are left behind, making the process much cheaper and opening the door to the development of more sophisticated and safer systems for autonomous driving.

SYNTHIA in the news:

This project has been co-funded by the European Union through its European Regional Development Fund programme.


Developing the bronchoscope of tomorrow


A combined group of computer scientists and pulmonologists in Barcelona has created the bronchoscope of the future by combining computer vision and videogame technology. By incorporating cutting-edge technology into regular bronchoscopes, they have given doctors a tool for improved, more precise diagnosis, and patients a quicker, easier medical intervention.

The project, led by Dr. Debora Gil, senior researcher at the Computer Vision Center, has developed a technology that allows a precise calculation of tracheal stenosis in a record time of 9 seconds. What's more, it creates a personalized map of the patient's lungs and is thus able to give extremely precise measurements for bronchial stents.

“Medical imaging had not advanced in bronchoscopy treatments for the last 20 years. Looking around me, technology in general was advancing at such a fast pace, everywhere! However, I couldn't see any improvement in my day-to-day practice. I was determined to change that”, states Dr. Antoni Rosell, pulmonologist at Barcelona's top university hospital Bellvitge-IDIBELL and partner in the project. Dr. Gil understood the surgeons' needs instantly and acknowledged the challenges: “We had to develop a totally new and revolutionary system, but we weren't able to introduce new instruments into the operating room; we needed to work with technology doctors were already familiar with”.

A digital bronchial map

This new, groundbreaking medical device uses computer vision to create a 3D representation of a lung, featuring the respiratory tract, bronchi and bronchioles, with the help of augmented reality. The original images are obtained from computed tomography scans and are turned into a precise virtual map of the respiratory system created with videogame technology. Furthermore, the software includes a digital representation of a bronchoscope, enabling surgeons to prepare the pathway of the bronchoscopy prior to the actual intervention by simulating the interaction between the bronchoscope and the bronchial walls.

Dr. Gil puts it graphically: “Imagine this apparatus”, bronchoscope in hand, “which has a mechanical movement, only upwards and downwards; that's all it does”. “Imagine”, she repeats, “that you have to memorize the pathway down the trachea, bronchi and bronchioles in order to reach the tumor. Doing this relying only on memory is a complex and unnecessary task. Left, right, up, down; but after a few turns in and out you've forgotten which way is up, which is down and where else to go. You literally get lost inside your patient's lungs!”.

Practicing beforehand is key. The software allows the doctor to mark and practice the pathways inside the patient's virtual lung, just like a GPS, and, most importantly, to see whether the planned internal route is even possible; bronchioles might be too thin or obstructed, making the way impassable. “What this means for the patient”, explains Dr. Rosell, “is that it makes the process less painful, quicker and easier. In the long run, it would make hospitals more efficient, as bronchoscopies would become easier and quicker”.

Tracheal stenosis calculation in record time

Dr. Carles Sánchez, also part of Dr. Debora Gil's team, is responsible for the software that performs the calculation of tracheal stenosis, a feature deployed in this medical system. Tracheal stenosis is the abnormal narrowing of the central air passageways. Until now, estimating the percentage narrowing of a patient's trachea was left to subjectivity: with diagnoses made by eye, variability was too high. “In our area of specialization”, states Dr. Rosell, “an objective tool for this calculation is essential, not only in diagnosis but also in treatment follow-up. If it is objective, it is reproducible and does not depend on the experience of the operator. Having this data allows us to see whether the treatment is effective, and therefore make proper decisions”.
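The underlying measure is straightforward: stenosis is commonly quantified as the percentage reduction of the airway's cross-sectional area relative to a healthy reference section. A minimal sketch of that standard formula (the function name and example values are ours, not the system's):

```python
def stenosis_percentage(obstructed_area_mm2, normal_area_mm2):
    """Percentage narrowing of the airway relative to a healthy section."""
    if normal_area_mm2 <= 0:
        raise ValueError("reference area must be positive")
    return 100.0 * (1.0 - obstructed_area_mm2 / normal_area_mm2)

# A lumen of 70 mm^2 against a healthy reference of 200 mm^2:
print(stenosis_percentage(70.0, 200.0))  # 65.0
```

The hard part, and what the 9-second software automates, is not this arithmetic but measuring the two areas reliably from bronchoscopy images in the first place.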

The process of obtaining an experienced ‘artificial’ eye has been cumbersome. The software has been trained to detect stenosis just as a physician would, by processing thousands of images of patients with tracheal stenosis. Dr. Sánchez designed the algorithms that detect the different degrees of narrowing, delivering a diagnosis in a record time of only 9 seconds.

Accurate prosthesis

The third application of this technology focuses on the accurate measuring of stents. With the use of augmented reality, measuring the inside of the patient's lung is easier than ever, producing an extremely accurate calculation of any part of the respiratory system. This allows experts not to miss a single millimeter when creating prostheses.

“Inside our body, 1 mm is a huge distance. By simulating the patient's lung at a precise scale, we can calculate prosthesis sizes with incredible accuracy”, points out Dr. Rosell. “Each prosthesis is worth between €1,000 and €2,000, so getting one wrong is costly. A tool such as ours can not only help save money, it will also reduce the patient's time in surgery, the use of anesthesia, hospital resources, the patient's and the doctor's time, etc.”

The future of bronchoscopy

When asked about the future of bronchoscopy, one thing is clear: this technology needs to reach industry. Researchers are now on the lookout for a company that will deploy these advances in the new generations of bronchoscopes.

Will we see this in hospitals any time soon? Dr. Antoni Rosell sighs: “Ten, maybe five years. It all depends on the private sector. The talent is here, the concept is designed and the research has been tested. We have reached our goal as a research group; it is now time for business”.

The Project has been funded by:







In the Media:

ZDNet, 04/04/2017.

Barcelona FM (podcast in Catalan).


Towards a no driver scenario: autonomous and connected cars at the Computer Vision Center


By 2050, 70% of the population is expected to live in cities; intelligent mobility is therefore, now more than ever, a pressing topic. Within this context, a project led by CVC researchers Dr. Antonio López and Dr. David Vázquez is achieving positive results with their platform, Elektra, which develops research in computer vision and deep learning to advance autonomous driving in city scenarios.

As Dr. López explains, the European Commission considers autonomous driving one of the top ten technologies that will drastically change citizens' lives. Not only will it reduce accidents, but it will also help include citizens with low physical mobility, make transport more efficient and therefore lower our carbon/petrol dependence.

“Elektra was born as an autonomous driving platform designed in the context of our project ACDC”, states Dr. Antonio López. The ACDC project (Automatic and Collaborative Driving in the City) directs its research in computer vision for ADAS towards level 5 automation (on a scale from 1 to 5, where 5 means no driver when driving in the city). A most challenging project indeed, only possible with a clear synergy between different research groups and enterprises.

Elektra brings together more than 20 professionals from different backgrounds, all contributing to the project. Among the research groups we find the CVC ADAS (Advanced Driver Assistance Systems) group; the CAOS (Computer Architecture & Operating Systems) group at the UAB (Universitat Autònoma de Barcelona); the Research Center of Supervision, Safety and Automatic Control at the UPC (Universitat Politècnica de Catalunya); the CTTC (Telecommunications Technological Centre of Catalonia) and the IEEC (Institute of Space Studies of Catalonia); as well as the UAB-DEIC-Senda (Department of Information and Communications Engineering – Security of Networks and Distributed Applications) research group and the UAB-CEPHIS (Center of Hardware-Software Prototypes and Solutions) team. On the business side, Elektra benefits from the know-how of CT Ingenieros, a Barcelona-based company dedicated to engineering innovation across different infrastructure sectors.

The project, built around an electric prototype, relies heavily on computer vision techniques for perception (stereo, stixels, obstacle detection, scene understanding), which tend to be computationally demanding, as well as on localization (GPS + IMU and vision) and navigation (control and planning). With this, the group has reached its first milestone: to “move autonomously from a starting point to a final point in a comfortable way, ensuring that the trajectory always corresponds to free navigable space and without disturbing pedestrians”.

Cameras, being passive sensors, have been the AI community's preferred choice for driving. “Images give us a high amount of information to drive”, explains Dr. López, “after all, this is how humans do it. We therefore needed to give the car the ability to interpret the information around it”. That includes pedestrian (obstacle) detection, free navigable space detection, localization and route planning. “Of course, cameras aren't as precise as a human eye. Autonomous driving can't rely on computer vision alone, but it is a great ally”, clarifies Dr. López.

As Dr. Vázquez sees it: “In order to have a car that can drive, you need several things. Firstly, accurate pedestrian (obstacle) detection, in which the CVC is definitely a pioneer. Secondly, free navigable space detection, which is no more than detecting the lane free of obstacles or interferences. Thirdly, localization: the car needs to know where it is and where it is heading. Fourthly, planning: the car has to plan its way from point A to point B in the smoothest way possible and thus define a global trajectory. And last but not least, control: executing the motion plan by performing the necessary manoeuvres”.
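The five components Dr. Vázquez lists fit together as one perception-to-actuation loop. The sketch below is a hypothetical skeleton of that loop; every function name and all the dummy sensor outputs are invented for illustration and are not Elektra's code:

```python
# Hypothetical skeleton of the five-stage loop described above:
# obstacle detection, free space detection, localization, planning, control.

def detect_obstacles(frame):
    return [(12.0, -1.5)]            # dummy: one pedestrian ~12 m ahead

def detect_free_space(frame, obstacles):
    # keep candidate waypoints that stay clear of every obstacle
    return [(x, y) for x, y in [(5.0, 0.0), (10.0, 0.0)]
            if all(abs(x - ox) > 2.0 for ox, oy in obstacles)]

def localize(gps, imu):
    return (0.0, 0.0)                # dummy ego position (fused GPS + IMU)

def plan(position, goal, free_space):
    # naive global plan: drive through free space towards the goal
    return free_space + [goal]

def control(waypoints):
    # emit one (steer, throttle) command per waypoint
    return [(0.0, 0.3) for _ in waypoints]

obstacles = detect_obstacles(frame=None)
free = detect_free_space(None, obstacles)
pose = localize(gps=None, imu=None)
commands = control(plan(pose, goal=(20.0, 0.0), free_space=free))
print(len(commands))  # 2
```

In a real system each stub is a hard research problem in its own right (deep networks for detection, sensor fusion for localization, optimization for planning), but the data flow between the five stages is essentially this.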

But cameras aren't the only way to grant the car perceptive abilities. Other technologies are based on sensors such as lidar and radar, whose raw data has a more direct interpretation and thus provides accurate distance estimation under different environmental conditions. The problems here are cost (these sensors are highly expensive, especially compared to cameras), the poor resolution of the data, and the lack of detail when capturing the world's appearance. Visual data is comparatively far richer in complexity and detail; the challenge is not only to give cars the ability to see and interpret, but to enable them to take decisions when faced with different circumstances.

Related article: The future of autonomous cars: Understanding the city with the use of videogames.

More information about the project at the Elektra website:
