


Diffbot: Machine Learning Is the New Big Data, and Giving It Away Might Be a Money Maker

“Everything’s becoming intelligent, but the limiting factor of intelligence is access to structured data,” Diffbot CEO Mike Tung says.

Diffbot, an artificial intelligence company that helps clients extract and combine data from multiple Web sources, wants to scrape all the data on the web (all of it) and put it into a structured format, making it useful for all sorts of business purposes, and to make money doing so. The company says its technology “uses computer vision and NLP algorithms to extract and structure any web page into the world’s largest structured database… with no human curation or oversight.”

Founded in 2009, the Palo Alto, CA-based startup announced today that it raised $10 million from investors to expand its “knowledge-as-a-service” offerings to businesses and consumer apps. It has raised close to $13 million since its seed round in 2012. Diffbot’s plan is to catalog trillions of facts across the Web, many of them drawn from page elements such as comment forums, which can’t be mined by traditional search engines.

Web mining can be a competitive advantage for apps as well as the proliferating devices of the Internet of Things, Tung says.

The startup says it has made a significant start on that goal, having indexed 1.2 billion entities such as people, products, and places since the middle of last year. Its Global Index also encompasses 10 to 20 times that number of facts, says Diffbot founder and CEO Mike Tung. Last June, the company said its database had surpassed the size of Google’s Knowledge Graph.

Diffbot does more than search what people are saying on their Twitter and Facebook feeds. It looks at comment threads on Reddit and in customer support forums, basically everywhere on the web. By structuring all that wildly unstructured data, Diffbot makes it searchable and thus useful. Small companies can get started for free. Big companies pay based on the volume of data they need to access.

The startup’s key early innovation was to extend the search function into previously uncharted territory by teaching computers how to recognize the various sub-sections of Web pages, including headlines, ad boxes, pictures, and discussion threads. Diffbot could then classify each page by type, such as news articles and product pages. That knowledge allows the computers to find and assemble related information, such as product prices across various retailers, and consumer opinions across many social media platforms and comment sections. The technology creates “structured data” that machines can read and interpret, Diffbot says.
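To make “structured data” concrete, here is a minimal Python sketch of what a business could do once pages have been classified and their fields extracted. It is not Diffbot’s API or method: the records, the field names (page_type, title, retailer, price), and the helper functions are all hypothetical, invented only to illustrate comparing product prices across retailers.

```python
from collections import defaultdict

# Hypothetical records, shaped the way a page-extraction step might emit them.
# The field names and values are illustrative assumptions, not a real schema.
records = [
    {"page_type": "product", "title": "Acme Kettle", "retailer": "ShopA", "price": 24.99},
    {"page_type": "product", "title": "Acme Kettle", "retailer": "ShopB", "price": 22.50},
    {"page_type": "article", "title": "Kettle buying guide"},
    {"page_type": "product", "title": "Acme Toaster", "retailer": "ShopA", "price": 31.00},
]


def prices_by_product(records):
    """Group product-page records by title, mapping each retailer to its price."""
    grouped = defaultdict(dict)
    for rec in records:
        if rec.get("page_type") != "product":
            continue  # skip articles, discussions, and other page types
        grouped[rec["title"]][rec["retailer"]] = rec["price"]
    return dict(grouped)


def cheapest(offers):
    """Return the (retailer, price) pair with the lowest price."""
    return min(offers.items(), key=lambda item: item[1])


comparison = prices_by_product(records)
```

Calling cheapest(comparison["Acme Kettle"]) then picks the lowest-priced retailer for that product. The point of the sketch is that once every page is reduced to typed records, a comparison like this takes a few lines instead of a custom scraper per site.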

Diffbot has been scaling up its data center, adding to its bank of proprietary servers with specialized hardware, and integrating Web-based processing power into the system to meet surges of demand. The company’s new money will accelerate the scale-up and fund an expansion of its R&D team, Tung says. Diffbot works in any language, Tung says. “It can tell you who the speakers are, and what they’re saying,” he says. The company’s technology is “sufficiently powerful to reduce information asymmetry.”

“We’ve proven it’s possible to build a profitable AI business model,” Tung says.

With more than 250 customers, including Amazon, CBS Interactive, eBay, Microsoft, and Salesforce, Diffbot became profitable at the end of 2015, Tung says. The research groups at Google and Facebook are Diffbot’s closest rivals in the development of methods to gather and synthesize Web data using artificial intelligence technology, Tung says. But rather than keeping the knowledge in-house, Diffbot is making it available to outside companies.

“We’re sort of like Switzerland in the AI wars,” Tung says.

It’s worth noting that far larger companies are struggling to find a good business model for AI, or cognitive computing, or whatever the next name for this self-teaching technology will be. Tung says Diffbot’s operating expenses are low because its automated data collection and analysis technology requires no human curation.

The second goal for Diffbot’s $10 million Series A financing round was to make alliances with investors experienced in artificial intelligence, Tung says. The round was led by Tencent, China’s leading Internet service provider, and Felicis Ventures. Tencent is not a customer of Diffbot’s, Tung says. He adds the word “now.”

Other startups are pursuing a similar AI-as-a-service model, recognizing that while the Internet giants have the resources to push the envelope in things like computer vision and natural language understanding, lots of companies can benefit from these technologies.

Believe it or not, Diffbot still has a tiny staff of 14 people.
