Tech News

Tencent to open AI research center in Seattle

Chinese tech conglomerate Tencent will be opening a new AI research center in Seattle,

...

Someone scraped 40,000 Tinder selfies to make a facial dataset for AI experiments

Tinder users have many motives for uploading their likeness to the dating app. But contributing a facial biometric to a downloadable dataset for training convolutional neural networks probably wasn’t top of their list when they signed up to swipe.

A user of Kaggle, a platform for machine learning and data science competitions which was recently acquired by Google, has uploaded a facial dataset he says was created by exploiting Tinder’s API to scrape 40,000 profile photos from Bay Area users of the dating app, 20,000 apiece from profiles of each gender.

The dataset, called People of Tinder, consists of six downloadable zip files, with four containing around 10,000 profile photos each and two files with sample sets of around 500 images per gender.

Some users have had multiple photos scraped from their profiles, so there are likely far fewer than 40,000 Tinder users represented here.

The creator of the dataset, Stuart Colianni, has released it under a CC0: Public Domain License and also uploaded his scraper script to GitHub.

He describes it as a “simple script to scrape Tinder profile photos for the purpose of creating a facial dataset”, saying his inspiration for creating the scraper was disappointment working with other facial datasets. He also describes Tinder as offering “near unlimited access to create a facial data set” and says scraping the app offers “an extremely efficient way to collect such data”.

“I have often been disappointed,” he writes of other facial datasets. “The datasets tend to be extremely strict in their structure, and are usually too small. Tinder gives you access to thousands of people within miles of you. Why not leverage Tinder to build a better, larger facial dataset?”

Why not — except, perhaps, the privacy of thousands of individuals whose facial biometrics you’re dumping online in a mass repository for public repurposing, entirely without their say-so.

A glance through a few of the images from one of the downloadable files suggests they are the sort of quasi-intimate photos people use for profiles on Tinder (or indeed, for other online social apps), with a mix of selfies, friend group shots and random stuff like photos of cute animals or memes. It’s by no means a flawless dataset if it’s just faces you’re looking for.

Reverse image searching several of the photos mostly drew blanks for exact matches online so it appears that many of the photos have not been uploaded to the open web — though I was able to identify one profile image via this method: a student at San Jose State University, who had used the same image for another social profile.

She confirmed to TechCrunch she had joined Tinder “briefly a while back”, and said she doesn’t really use it anymore. Asked if she was happy about her data being repurposed to feed an AI model, she told us: “I don’t like the idea of people using my pictures for some sad ‘researches’.” She preferred not to be identified for this article.

Colianni writes that he plans to use the dataset with Google’s TensorFlow Inception (for training image classifiers) to try to create a convolutional neural network

...

California orders RydenGo to shut down its website

On April 13, the state of California issued a cease and desist letter to

...

Data management startup Rubrik confirms $180M round at a $1.3B valuation

Rubrik, a startup that provides data backup and recovery services for enterprises across both cloud and on-premises environments, has closed a $180 million round of funding that values the company at $1.3 billion. The news confirms a report we ran earlier this week noting that the company was raising between $150 million and $200 million.

IVP (as we noted sources told us might be the case) led the round, with Lightspeed and Greylock also participating.

The funding, co-founder Bipul Sinha told us, comes as the company hit a $100 million run rate in January of this year. “We’ve had significant traction and wanted to double down to capture the market demand that we were experiencing,” he said.

The company was valued at $600 million last year when it raised its previous round of $60 million, and while it’s not yet profitable, the cash it has been generating has been enough to fuel its growth up to now.

“We have not touched the capital from our last round,” said Sinha. “We have 60 million in the bank right now and we were not looking to raise capital, but we got a very strong preemptive interest across multiple investors and we decided to pull the trigger to double down on engineering and marketing.”

Rubrik’s services today mainly run using an appliance that an enterprise uses to back up, restore, and index data across both on-premises and cloud-based environments — a hybrid that represents the norm for most large organizations. Earlier this week, the company released a new product that runs natively in the cloud, bypassing the need for the appliance. It’s this that spells the future direction for the company, Sinha told TechCrunch: it will be building more cloud-first products going forward in areas that complement what it is already doing in backup and recovery, like security.

The company competes with the likes of Druva, CommVault and EMC, but it has taken off because of its new and efficient approach to an old problem. “We have woken up a sleepy market,” Sinha said.

Rubrik’s beginnings make for an instructive story for people looking for a promising area to tap for a startup. Sinha, coming from the world of VC, was used to hunting for gaps and subsequent opportunities in the market.

“I had been looking at the backup and recovery market,” he said, “and realised that it hadn’t been innovated in ten years. I then looked at public cloud and wondered how will it be protected in the longer term. The two go together: how we can marry them and define a new standard?” From that, he called to consult with a friend, “who is now our CTO, and then two others”: Arvind Jain (an ex-Google engineer), Soham Mazumdar (an engineer and founder who sold Tagtile to Facebook and is also an ex-Googler), and Arvind Nithrakashyap (a storage and distributed systems expert who is an alum of Rocketfuel

...

Auto site Carvana tumbles 26% in stock market debut

Carvana, the site for buying and selling cars, had a rough first day in the public markets. After pricing its IPO at $15 per share, it ended the day down 26% at $11.10.

...