Helping to deal with dirty data

Data. It’s the raw material of technological advancement.

But data can be a dirty, inconsistent mess, and that can be a huge obstacle for under-resourced startups looking to use that data to fuel their AI and machine-learning innovations.

This is where Communitech’s Extended Data Concierge service comes in, linking founders with data science experts who can assume some of the heavy lifting to clean, organize and label the data that will move their enterprises forward.

Wladimir Daurel, Program Manager of Advanced Technology Platforms for Communitech, and co-ordinator of the data concierge service, says that startups have limited staff, who can easily be tied up cleaning data, “which is not the highest value work that they could be doing. Managing that data, wrangling that data, takes up a lot of time.”

The value proposition for the service is that it helps founders connect with data science partners and can thereby “free up their internal resources so that they can tackle other projects they had on the shelf, or so they can spend the time to improve their algorithm...”

So far, the Extended Data Concierge service has helped to complete 45 projects over the past 15 months. One requirement for admission into the program is that the startup have a project that could involve one of three core goals: data acquisition, enrichment and preparation; data engineering; or data science. The project should relate to autonomous vehicles, HD mapping, 5G or next-gen cloud computing.

Daurel says the process is simple: A customer comes to the Extended Data Concierge service with an intention and is put in touch with a vendor. “The vendors would draw up a plan and we come up with a schedule of milestones and deliverables, what action items and what type of tools are needed.”

Daurel says the program has been very effective. One project, for instance, accomplished 480 hours of data labelling over three months. Another project resulted in an algorithm being fine-tuned to 95-per-cent accuracy, from the previous 82-per-cent accuracy rate.

Key to this is the experience of the vendors, including program veterans Ample Insight, Beam Data, and SBX Robotics. The experience is also a learning opportunity for both the data science experts and the startup founders.

The 20 staffers at four-year-old Toronto-based Ample Insight have been involved with the service for a year, working on more than two dozen projects. 

CEO Garry Ma says that Ample has helped startups in medtech, agritech, cybersecurity, transportation, environmental analytics and real estate. And much of the company’s expertise comes from its experience providing advice and services to a diverse set of global clients across a variety of industries outside of the data concierge service, on such matters as data strategy, AI and machine learning, analytics, data engineering and data labelling. 

“Beyond the purely technical projects, we also help startups design products that need to surface a large amount of complex data or sophisticated AI. We specialize in simplifying complex data visualizations and turning them into something that is user-friendly and engaging for non-technical users,” says Ma.

As well, Ma says that Ample has worked with companies that have gotten into the weeds of the data problem on their own. Ample brings fresh eyes to the problem: “…we take a step back and look at their overall goal, and sometimes, we need to redefine the problem that they are trying to solve. And a lot of times, we can redefine the problem so that it is more closely aligned to their actual goal.”

“That actually takes a bit of time. But identifying and scoping the right problem is half the battle, and oftentimes makes the project more straightforward, and better aligns the key stakeholders.”

Helping a startup refine, or redefine, its data strategy “is actually a big value for these companies.”

Five-year-old Beam Data was the first vendor for the data concierge service, says Beam co-founder Shaohua Zhang. “I even proposed the idea of the Extended Data Concierge.” The company’s 15 to 20 Toronto-based staffers have worked for hours in some cases, and for weeks in others, helping startups reduce their R&D costs as they test their products: “Something might fail,” says Zhang. “They might want to try something out, but they need the resources to do that.”

Zhang says each project may have two or three Beam staff working on it, with several projects supervised by a project manager. There are weekly meetings with the clients and daily meetings with the team. “We provide summary reports that might include, ‘Here’s your next step’ or . . . create a document that tells them, ‘Here are three things you can do in the future when you have the resources.’ And we may work with them outside the scope of the Extended Data Concierge.” Several companies have hired Beam after their time in the data concierge service.

Of course, Beam continues its work with traditional businesses. One is an American car-auction firm with a lot of legacy systems: “We are helping them modernize their data stacks. We’re helping them move to the cloud, helping them build a data warehouse.” 

That kind of work is typical for a data consultancy, but the data concierge service offers opportunities for vendors to learn, too. “With the Extended Data Concierge projects . . .  it’s very exciting. We were involved in things like tuning deep learning models for autonomous driving. We’ve helped IoT startups.”

The projects that Beam has collaborated on include labelling data for a computer vision company (“We labelled 2,000 to 3,000 images in a relatively short time period.”), helping a company improve its deep learning model accuracy by 10 per cent, and helping a health-care app company build its first BI (business intelligence) dashboard.

Zhang says the different use cases have “definitely helped Beam become a better consulting company, so in the future we can potentially take on many different types of projects.”

Garry Ma agrees that the data concierge service has benefits for both vendor and client: “Of course we learned from being exposed to sometimes very unique problems. It helps us as an organization to grow. 

“To see what people are working on, obviously the IP and confidential information of our clients is very important, but the overall pattern of the industry, the different types of industries and their different directions, has been helpful for us.”

As well, working on challenging problems is invigorating for Ample’s data scientists: “They have the opportunity to work with many startups, to be solving really interesting problems. Sometimes really difficult problems, but they are here for that challenge and are inspired by it. It has been very rewarding when we are able to solve those problems for our clients.”

Communitech’s Wladimir Daurel says that although the Extended Data Concierge service cycle ended March 31, work has begun on Version 2.0, which will be available in the spring of this year.

The success of the service has been a high point for Daurel and Communitech: “We are incredibly happy with the work being done by our partners. It has helped us deliver substantial value to the founders we support.”



Communitech
https://communitech.ca
"Communitech helps tech-driven companies start, grow and succeed. Communitech was founded in 1997 by a group of entrepreneurs committed to making Waterloo Region a global innovation leader. At the time it was crazy talk, but somehow this community managed to pull it off. Today, Communitech is a public-private innovation hub that supports a community of more than 1400 companies — from startups to scale-ups to large global players. Communitech helps tech companies start, grow and succeed in three distinct ways: - Communitech is a place – the center of gravity for entrepreneurs and innovators. A clubhouse for building cool shit and great companies. - Communitech delivers programs – helping companies at all stages with access to capital, customers and talent. We are here to help them grow and innovate. - Communitech partners in building a world-leading ecosystem – making sure we have all the ingredients (and the brand) to go from a small startup to a global giant."
