Fast.ai Image Classification

Quickly import *.udt.csv files into fast.ai for image classification.


Last updated 4 years ago


Example Dataset

We're going to use udt-dataset-cats-and-dogs, a dataset of labeled images of cats and dogs created from COCO. For this guide you don't need to download it directly, because we'll load it right from our notebook.

Don't like cats and dogs? You can also use any classification from the Common Objects in Context (COCO) dataset with the Import COCO button! Maybe try to classify bears vs cats!

Import CSV Into Pandas Dataframe

We can begin by importing the fastai library, pandas, and our udt.csv file.

# Note: fastai.vision with ImageDataBunch is the fastai v1 API
from fastai.vision import *
import pandas as pd

url_to_csv = "https://raw.githubusercontent.com/UniversalDataTool/udt-dataset-cats-and-dogs/master/coco_dogs_and_cats.udt.csv"
udt_csv = pd.read_csv(url_to_csv)

You can use the udt.json format too; tables are just a nice way to visualize the data!
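
As a rough illustration of the udt.json route, the sketch below pulls labeled samples out of a small inline JSON document using only the standard library. The JSON content is made up, and the key names ("interface", "samples", "imageUrl", "annotation") are assumptions about the file layout; labeled_samples is a hypothetical helper, not part of UDT or fastai.

```python
import json

# Hypothetical udt.json content; the key names ("interface", "samples",
# "imageUrl", "annotation") are assumptions about the file layout.
udt_json_text = """
{
  "interface": {"type": "image_classification", "labels": ["cat", "dog"]},
  "samples": [
    {"imageUrl": "https://example.com/cat1.jpg", "annotation": "cat"},
    {"imageUrl": "https://example.com/dog1.jpg", "annotation": "dog"},
    {"imageUrl": "https://example.com/unlabeled.jpg"}
  ]
}
"""

def labeled_samples(udt):
    """Return (imageUrl, annotation) pairs for samples that have a label."""
    return [
        (s["imageUrl"], s["annotation"])
        for s in udt.get("samples", [])
        if "imageUrl" in s and s.get("annotation") is not None
    ]

udt = json.loads(udt_json_text)
pairs = labeled_samples(udt)
```

Passing these pairs to pd.DataFrame(pairs, columns=["imageUrl", "annotation"]) would give a table with the same two columns the rest of this guide relies on.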

Download Images

# Get the rows of our CSV that hold sample data.
# regex=False matches "samples." literally instead of treating "." as a wildcard.
samples = udt_csv[udt_csv["path"].str.contains("samples.", regex=False)]

# Create two CSVs that contain only
# our cat image URLs and dog image URLs

dog_samples = samples[samples["annotation"] == "dog"]
cat_samples = samples[samples["annotation"] == "cat"]

dog_samples.to_csv("dog_urls.csv", columns=["imageUrl"], header=False, index=False)
cat_samples.to_csv("cat_urls.csv", columns=["imageUrl"], header=False, index=False)
# Now we can download the images!
download_images("dog_urls.csv", "images/dog", max_pics=500)
download_images("cat_urls.csv", "images/cat", max_pics=500)

# Let's make sure all the images are readable
verify_images("images/dog", delete=True, max_size=500)
verify_images("images/cat", delete=True, max_size=500)
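
To make the filtering step above more concrete, here is a minimal standard-library sketch of the same idea on a tiny made-up table: keep only the sample rows, then group the image URLs by label. The URLs and row values are hypothetical, the column names mirror the ones used in this guide, and urls_by_label is an illustrative helper rather than anything from fastai or UDT.

```python
import csv
import io
from collections import defaultdict

# Tiny made-up stand-in for the udt.csv rows; the URLs are hypothetical and
# the column names mirror the ones used in this guide.
csv_text = """path,imageUrl,annotation
interface.type,,
samples.0,https://example.com/a.jpg,dog
samples.1,https://example.com/b.jpg,cat
samples.2,https://example.com/c.jpg,dog
"""

def urls_by_label(text):
    """Group image URLs by annotation, keeping only labeled sample rows."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        if row["path"].startswith("samples.") and row["annotation"]:
            groups[row["annotation"]].append(row["imageUrl"])
    return dict(groups)

groups = urls_by_label(csv_text)
```

Each list in groups corresponds to what ends up in one of the per-label URL files, one URL per line.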

Create an ImageDataBunch

From here, everything should seem pretty normal. We can create an ImageDataBunch from our images directory.

data = ImageDataBunch.from_folder(
    "images",
    train=".",
    valid_pct=0.2,
    ds_tfms=get_transforms(),
    size=224,
    num_workers=4,
).normalize(imagenet_stats)

# Let's take a look at the data
data.show_batch(rows=3, figsize=(7,8))
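
Because from_folder takes its labels from the subfolder names, it can help to check how many images remain in each class folder after downloading and verifying. The sketch below is one way to do that with the standard library; count_images_per_class is a hypothetical helper, and the demo runs on a throwaway directory of empty placeholder files rather than real images.

```python
import tempfile
from pathlib import Path

def count_images_per_class(root):
    """Map each subfolder name (the class label) to its number of files."""
    return {
        d.name: sum(1 for f in d.iterdir() if f.is_file())
        for d in sorted(Path(root).iterdir())
        if d.is_dir()
    }

# Demo on a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for label, n in [("cat", 3), ("dog", 2)]:
        (Path(tmp) / label).mkdir()
        for i in range(n):
            (Path(tmp) / label / f"{i}.jpg").touch()
    counts = count_images_per_class(tmp)
```

In the notebook, calling count_images_per_class("images") would show whether the two classes are roughly balanced before training.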

Train a Model

We can now train a model! This is just a simple one; don't forget to fine-tune!

learn = cnn_learner(data, models.resnet34, metrics=error_rate)
learn.fit_one_cycle(4)

Note on image downloads: UDT datasets contain only links to images, not the image files themselves, which is why the Download Images step fetches the actual files with the fast.ai download_images function.