Setup and Manual Implementation


In this video we are going to get started creating a Symfony console application that allows us to read a CSV file from local disk, and turn that CSV file into a set of related Doctrine entities.

We aren't going to be creating the CSV reader from scratch. Instead, we will make use of the excellent league/csv data manipulation library. This will save us a whole bunch of time for very little effort. You are, of course, free to create your own CSV reader - or use an entirely different implementation - but we won't be focusing on the reader implementation in this series.

These videos have been created as a direct response to a question posed by a member here at CodeReviewVideos. The example CSV file we will be using is based on the data provided as part of that question. I have made slight alterations based on how I would approach this problem, but roughly this is "as asked".

Here's the question:

Do you have any tutorials for CSV importing to 2 nested tables?

Table 1 called Athletes: First Name, Last Name, DOB Table 2 called Competitors: Athlete,Category, Weight, Competition

CSV will include all Columns so at import the info accordingly

Thanks for the guidance

I get asked this question quite frequently, so thought it best to address this by way of both videos and some usable code.

As mentioned, I will make a tweak to this slightly - as I believe weight would be better assigned to the Athlete entity. I will explain more on this as we go through. You may disagree, and as ever, that is perfectly fine - I am concentrating on the high level overview here, rather than the specifics of this particular entity / database design.

Creating Fake CSV Data

You can either use my MOCK_DATA.csv file, or go ahead and generate your own.

Or you can use your own existing CSV data of course.

There are many ways to generate a CSV file full of fake data. The way in which I generated this MOCK_DATA.csv was to use Mockaroo. It worked well, has a nice web interface, and a bunch of options that made my task very straightforward.

Another way could be to use the league/csv package in its Writer mode, and create some fake data using fzaninotto/Faker for example. There are likely many other combos to do this.

Whatever method you prefer, about a thousand rows is best to toy around with this concept. Any fewer and you won't get enough data to properly experience some of the problems around relations, and any more will just make the process slower than it need be.

To give you a quick overview, our CSV file looks something like this:

first_name,last_name,date_of_birth,category,weight,competition
Jesse,Tucker,10/29/1939,category 1,150,competition 3
Phyllis,Price,10/24/1930,category 2,78,competition 3
Matthew,Butler,10/16/1903,category 2,87,competition 3
Janet,Robinson,07/15/1974,category 2,106,competition 2
Donna,Carter,07/24/1914,category 3,121,competition 3
Diane,Russell,03/10/1963,category 1,77,competition 1
Alice,Hicks,03/26/1953,category 2,74,competition 2
Rebecca,Kim,11/02/1970,category 2,102,competition 2
Walter,Thompson,11/06/1973,category 1,140,competition 3

The end goal here is to have two tables in our database.

One will contain the columns:

  • first_name
  • last_name
  • date_of_birth
  • weight

And the second will contain:

  • category
  • competition

There are only three categories: category 1, category 2, and category 3.

Likewise, there are only three competitions: competition 1, competition 2, and competition 3.

From this we can deduce there are only 9 possible combinations:

  • category 1 and competition 1
  • category 1 and competition 2
  • category 1 and competition 3
  • category 2 and competition 1
  • category 2 and competition 2
  • category 2 and competition 3
  • category 3 and competition 1
  • category 3 and competition 2
  • category 3 and competition 3

Therefore, we should expect to only ever have 9 records in our Competitor database table.

At this point, it would be remiss of me not to say that the wording on these entities is a little unclear. I am using the wording from the original question, so please take that into account. Competition would seem to be the "more proper" entity name here, in my opinion.

Whilst there are only ever likely 9 entries in the competitor table, there will be 1 entry per row of our CSV file for the Athlete entity. In other words, 1000 rows.

Creating The Entities

At this point we are going to start by creating the two entities that are needed to save off our individual records to the database.

If you are unsure on what a Doctrine entity is, or how we go about creating and using them, I would strong recommend the Doctrine Databasics series here at CodeReviewVideos.

To create the two entities required for this tutorial you can either:

  • copy / paste from the example project (also, code shown below)
  • type them out by hand
  • use the doctrine:generate:entity command

I would recommend following the php bin/console doctrine:generate:entity command as this will also create the associated Repository class for you. Also, it's nice and easy :)

This is the input needed to create the Athlete entity:

➜  csv-import-example git:(vid) ✗ php bin/console doctrine:generate:entity

  Welcome to the Doctrine2 entity generator

This command helps you generate Doctrine2 entities.

First, you need to give the entity name you want to generate.
You must use the shortcut notation like AcmeBlogBundle:Post.

The Entity shortcut name: AppBundle:Athlete

Determine the format to use for the mapping information.

Configuration format (yml, xml, php, or annotation) [annotation]:

Instead of starting with a blank entity, you can add some fields now.
Note that the primary key will be added automatically (named id).

Available types: array, simple_array, json_array, object,
boolean, integer, smallint, bigint, string, text, datetime, datetimetz,
date, time, decimal, float, binary, blob, guid.

New field name (press <return> to stop adding fields): firstName
Field type [string]:
Field length [255]:
Is nullable [false]:
Unique [false]:

New field name (press <return> to stop adding fields): lastName
Field type [string]:
Field length [255]:
Is nullable [false]:
Unique [false]:

New field name (press <return> to stop adding fields): dateOfBirth
Field type [string]: datetime
Is nullable [false]:
Unique [false]:

New field name (press <return> to stop adding fields): weight
Field type [string]: integer
Is nullable [false]:
Unique [false]:

New field name (press <return> to stop adding fields):

  Entity generation

  created ./src/AppBundle/Entity/Athlete.php
> Generating entity class src/AppBundle/Entity/Athlete.php: OK!
> Generating repository class src/AppBundle/Repository/AthleteRepository.php: OK!

And for the Competitor entity:

➜  csv-import-example git:(vid) ✗ php bin/console doctrine:generate:entity

  Welcome to the Doctrine2 entity generator

This command helps you generate Doctrine2 entities.

First, you need to give the entity name you want to generate.
You must use the shortcut notation like AcmeBlogBundle:Post.

The Entity shortcut name: AppBundle:Competitor

Determine the format to use for the mapping information.

Configuration format (yml, xml, php, or annotation) [annotation]:

Instead of starting with a blank entity, you can add some fields now.
Note that the primary key will be added automatically (named id).

Available types: array, simple_array, json_array, object,
boolean, integer, smallint, bigint, string, text, datetime, datetimetz,
date, time, decimal, float, binary, blob, guid.

New field name (press <return> to stop adding fields): category
Field type [string]:
Field length [255]:
Is nullable [false]:
Unique [false]:

New field name (press <return> to stop adding fields): competition
Field type [string]:
Field length [255]:
Is nullable [false]:
Unique [false]:

New field name (press <return> to stop adding fields):

  Entity generation

  created ./src/AppBundle/Entity/Competitor.php
> Generating entity class src/AppBundle/Entity/Competitor.php: OK!
> Generating repository class src/AppBundle/Repository/CompetitorRepository.php: OK!

This won't have created any new tables in your database, so don't worry if you don't see them just yet.

One thing the generator won't do, however, is create you a relationship. We will need to define this manually. If unsure on entity relationships, I would recommend you watch this video.

In this example, many Athletes can be in one Competitor (think: Competition, see the earlier note about grammar). Therefore we might want to add in a ManyToOne relationship from Athlete to Competitor:

    /**
     * @ORM\ManyToOne(targetEntity="AppBundle\Entity\Competitor")
     * @ORM\JoinColumn(name="competitor_id", referencedColumnName="id")
     */
    private $competitor;

This is better illustrated when viewing both entities in full:

<?php

// /src/AppBundle/Entity/Athlete.php

namespace AppBundle\Entity;

use Doctrine\ORM\Mapping as ORM;

/**
 * Athlete
 *
 * @ORM\Table(name="athlete")
 * @ORM\Entity(repositoryClass="AppBundle\Repository\AthleteRepository")
 */
class Athlete
{
    /**
     * @var int
     *
     * @ORM\Column(name="id", type="integer")
     * @ORM\Id
     * @ORM\GeneratedValue(strategy="AUTO")
     */
    private $id;

    /**
     * @ORM\ManyToOne(targetEntity="AppBundle\Entity\Competitor")
     * @ORM\JoinColumn(name="competitor_id", referencedColumnName="id")
     */
    private $competitor;

    /**
     * @var string
     *
     * @ORM\Column(name="first_name", type="string", length=255)
     */
    private $firstName;

    /**
     * @var string
     *
     * @ORM\Column(name="last_name", type="string", length=255)
     */
    private $lastName;

    /**
     * @var \DateTime
     *
     * @ORM\Column(name="date_of_birth", type="datetime")
     */
    private $dateOfBirth;

    /**
     * @var int
     *
     * @ORM\Column(name="weight", type="integer")
     */
    private $weight;

    /**
     * Get id
     *
     * @return int
     */
    public function getId()
    {
        return $this->id;
    }

    /**
     * @return mixed
     */
    public function getCompetitor()
    {
        return $this->competitor;
    }

    /**
     * @param Competitor $competitor
     *
     * @return Athlete
     */
    public function setCompetitor(Competitor $competitor)
    {
        $this->competitor = $competitor;

        return $this;
    }

    /**
     * Set firstName
     *
     * @param string $firstName
     *
     * @return Athlete
     */
    public function setFirstName($firstName)
    {
        $this->firstName = $firstName;

        return $this;
    }

    /**
     * Get firstName
     *
     * @return string
     */
    public function getFirstName()
    {
        return $this->firstName;
    }

    /**
     * Set lastName
     *
     * @param string $lastName
     *
     * @return Athlete
     */
    public function setLastName($lastName)
    {
        $this->lastName = $lastName;

        return $this;
    }

    /**
     * Get lastName
     *
     * @return string
     */
    public function getLastName()
    {
        return $this->lastName;
    }

    /**
     * Set dateOfBirth
     *
     * @param \DateTime $dateOfBirth
     *
     * @return Athlete
     */
    public function setDateOfBirth($dateOfBirth)
    {
        $this->dateOfBirth = $dateOfBirth;

        return $this;
    }

    /**
     * Get dateOfBirth
     *
     * @return \DateTime
     */
    public function getDateOfBirth()
    {
        return $this->dateOfBirth;
    }

    /**
     * Set weight
     *
     * @param integer $weight
     *
     * @return Competitor
     */
    public function setWeight($weight)
    {
        $this->weight = $weight;

        return $this;
    }

    /**
     * Get weight
     *
     * @return int
     */
    public function getWeight()
    {
        return $this->weight;
    }
}

and:

<?php

// /src/AppBundle/Entity/Competitor.php

namespace AppBundle\Entity;

use Doctrine\ORM\Mapping as ORM;

/**
 * Competitor
 *
 * @ORM\Table(name="competitor")
 * @ORM\Entity(repositoryClass="AppBundle\Repository\CompetitorRepository")
 */
class Competitor
{
    /**
     * @var int
     *
     * @ORM\Column(name="id", type="integer")
     * @ORM\Id
     * @ORM\GeneratedValue(strategy="AUTO")
     */
    private $id;

    /**
     * @var string
     *
     * @ORM\Column(name="category", type="string", length=255)
     */
    private $category;

    /**
     * @var string
     *
     * @ORM\Column(name="competition", type="string", length=255)
     */
    private $competition;

    /**
     * Get id
     *
     * @return int
     */
    public function getId()
    {
        return $this->id;
    }

    /**
     * Set category
     *
     * @param string $category
     *
     * @return Competitor
     */
    public function setCategory($category)
    {
        $this->category = $category;

        return $this;
    }

    /**
     * Get category
     *
     * @return string
     */
    public function getCategory()
    {
        return $this->category;
    }

    /**
     * Set competition
     *
     * @param string $competition
     *
     * @return Competitor
     */
    public function setCompetition($competition)
    {
        $this->competition = $competition;

        return $this;
    }

    /**
     * Get competition
     *
     * @return string
     */
    public function getCompetition()
    {
        return $this->competition;
    }
}

Now, we need to make sure our database is up to date with these changes. For this we need to update our database schema:

php bin/console doctrine:schema:update --force

In a real project I urge you to use Doctrine Migrations when changing your database in any way.

At this point you should be able to see your new tables in your database. The tool I use for this is SequelPro on a Mac. On Windows, or Linux (with Wine) I would recommend SqlYog, which is hands down the best tool for working with MySQL databases I have ever used (note: it's not free).

How To Create A Symfony Console Command

We are going to create a Symfony Console Command that will allow us to import our CSV data file by typing in:

php bin/console csv:import

To create a Symfony console command we need to follow a basic outline:

namespace AppBundle\Command;

use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;

class SomeConsoleCommand extends Command
{
    protected function configure()
    {
    }

    protected function execute(InputInterface $input, OutputInterface $output)
    {
    }
}

You can read more about this on the official documentation. I'm not going to pretend that you should memorise this code in any way. I always copy / paste from the docs (or another existing command in my project) when creating a new command. Another smart way to work might be to create yourself a Live Template to quickly recreate this code as needed.

One of the really nice parts about the Symfony framework is that you don't necessarily need to create web / MVC applications. It's particularly good at that, but you could have an entire application that just uses console commands. Or is a JSON API. Or some other variant that's not your 'typical' web app.

To begin with, we are going to create a console command that demonstrates how to create two entities, relate them, and persist (save) them to our database.

By breaking down our problem from "how to I read a CSV and save related entities off to the database" into two problems:

  • how do I process a CSV file
  • how do I relate two entities

This problem becomes much easier to tackle. After all, if we can create and related two entities by hand, then repeating this process for as many rows as we have in a CSV file shouldn't be that much harder, right?

Our Symfony console command starts off looking like this:

<?php

// /src/AppBundle/Command/CsvImportCommand.php

namespace AppBundle\Command;

use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;
use Symfony\Component\Console\Style\SymfonyStyle;

/**
 * Class CsvImportCommand
 * @package AppBundle\ConsoleCommand
 */
class CsvImportCommand extends Command
{
    /**
     * Configure
     * @throws \Symfony\Component\Console\Exception\InvalidArgumentException
     */
    protected function configure()
    {
        $this
            ->setName('csv:import')
            ->setDescription('Imports the mock CSV data file')
        ;
    }

    /**
     * @param InputInterface  $input
     * @param OutputInterface $output
     *
     * @return void
     */
    protected function execute(InputInterface $input, OutputInterface $output)
    {
        $io = new SymfonyStyle($input, $output);

        $io->success('Command exited cleanly!');
    }
}

Even though we are working with a Symfony console command, there's no reason it shouldn't look super-duper. By using the SymfonyStyle guide, we can easily build console commands that look just as good as they work :)

Aside from the use of SymfonyStyle, the only other new thing here is in defining the name we use for calling our command - csv:import. You can give this any name you like. What this changes is the way our command will be called:

php bin/console csv:import

Or:

    protected function configure()
    {
        $this
            ->setName('i:like:jam')
        ;
    }

php bin/console i:like:jam

And so on. There's no hard convention to follow here as best I am aware. Obviously you can't re-use the same command twice, and it's best not to re-use existing 'namespaces' - so don't create your commands like:

    protected function configure()
    {
        $this
            ->setName('doctrine:my:command')
        ;
    }

As doctrine is already in use. Aside from that, do what you like - but make it sensible. One convention might be to store everything under app. E.g app:csv:import. I don't follow that convention, but you might wish to, as it helps organisation.

At this stage we could run our command. Not very much will happen - you should just see a big green box saying "Command exited cleanly!", but it proves this thing works, right?

To run this command:

php bin/console csv:import

Or you could run:

php bin/console

And look in the output under csv to see the available commands in the csv 'namespace' - just the one in our case - and the description text we set in the configure method.

Manually Creating Entities

We now have a console command that runs, but doesn't do very much.

Let's make our command a little more interesting by manually defining an Athlete, and a Competitor, relate the two, and persist these two records off to the database.

<?php

// /src/AppBundle/Command/CsvImportCommand.php

namespace AppBundle\Command;

use AppBundle\Entity\Athlete;
use AppBundle\Entity\Competitor;
use Doctrine\ORM\EntityManagerInterface;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;
use Symfony\Component\Console\Style\SymfonyStyle;

/**
 * Class CsvImportCommand
 * @package AppBundle\ConsoleCommand
 */
class CsvImportCommand extends Command
{
    /**
     * @var EntityManagerInterface
     */
    private $em;

    /**
     * CsvImportCommand constructor.
     *
     * @param EntityManagerInterface $em
     *
     * @throws \Symfony\Component\Console\Exception\LogicException
     */
    public function __construct(EntityManagerInterface $em)
    {
        parent::__construct();

        $this->em = $em;
    }

    /**
     * Configure
     * @throws \Symfony\Component\Console\Exception\InvalidArgumentException
     */
    protected function configure()
    {
        $this
            ->setName('csv:import')
            ->setDescription('Imports the mock CSV data file')
        ;
    }

    /**
     * @param InputInterface  $input
     * @param OutputInterface $output
     *
     * @return void
     */
    protected function execute(InputInterface $input, OutputInterface $output)
    {
        $io = new SymfonyStyle($input, $output);

        // create new athlete
        $athlete = (new Athlete())
            ->setFirstName('tim')
            ->setLastName('jones')
            ->setDateOfBirth(new \DateTime('-30 years'))
            ->setWeight('123')
        ;

        $this->em->persist($athlete);

        // create new Competitor
        $competitor = (new Competitor())
            ->setCategory('category 1')
            ->setCompetition('competition 1')
        ;

        $this->em->persist($competitor);

        // relate the two
        $athlete->setCompetitor($competitor);

        // save / write the changes to the database
        $this->em->flush();

        $io->success('Command exited cleanly!');
    }
}

Ok, there's a fair amount of change here.

Let's start with the constructor.

We know we are going to need to work with the database. We could extend ContainerAwareCommand but I prefer not too. You can read more about the drawbacks of using ContainerAwareCommand in the official documentation.

Instead, we will define our console command as its own service. This is slightly more work, but the benefits outweigh the few extra lines of config, in my opinion. The benefits btw, being flexibility, and easier testing.

As we want to work with the database - and therefore Doctrine - we need to inject the Entity Manager. Rather than directly tie ourselves to a specific implementation, we will instead tell our console command that we will be injecting some class / concrete implementation that implements the EntityManagerInterface. Again, this provides flexibility, and easier testing.

Inside the constructor we must call the the __construct method in the class we are extending - Command in this case.

As we have injected our entity manager, we would also want to set this onto the newly created private $em class property at this point.

Then, inside the execute function we can start using those entities we defined earlier.

For the moment all the values are faked, and hardcoded. We will make these dynamic shortly.

Each entity should be persisted to ensure it will be written to the database when changes are next flushed.

We can then setCompetitor which relates the two, according to the relationship we created in our entities earlier.

Finally, we flush the changes to ensure our changes are written off to the database.

At this point, when running the command you should see a new record appear in each of your database tables each time you run / re-run the command. Of course, this isn't exactly what we want, but it proves that our entities work, and that the relationship is configured properly.

We've achieved a lot here. We've solved one of our two problems - creating and relating entities.

In the next video we will address the second problem - repeating this process for the 1000 rows in our CSV file.

Episodes

# Title Duration
1 Setup and Manual Implementation 04:43
2 Importing CSV Data The Easy Way 04:44