This post is part of a series.
In the previous post, we left off at the point where we wanted to look at the most common commands for managing environments with Conda. That is what we are going to do now.
Environment-specific commands
Currently, we are in the so-called “base” environment, which is indicated by the name in parentheses at the front of the command prompt.
See slide 1
This environment exists right from the beginning, but one shouldn’t actually use it for projects. It really just serves as a base environment that you might use to quickly test out a code snippet, for example.
So, let’s now see how we create a new environment that we can use for a particular project. To do that, we type:
See slide 2
The actual command is “conda create”. The argument “--name” specifies the name of the environment, in this case “iris_prediction” (preferably, the name should describe our project). After that, we list the packages that we want to install in this environment, namely Scikit-learn and Pandas.
As you can see, we can list several libraries in this command, and ideally that’s what we always do when creating a new environment. This way, dependency conflicts are less likely because Conda can resolve all dependencies at once, compared to creating the environment first and then installing the libraries one by one (we will see how to do that later on).
And in case we want to install a specific version of a package or of Python, we can specify it by appending an equals sign and the version number to the respective name.
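As a rough sketch of such a version specification (the slide itself is not reproduced here), the snippet below assembles a “conda create” command with pinned versions and prints it for review. The version numbers and the variable names “specs” and “create_cmd” are illustrative assumptions, not taken from the post.

```shell
# Version numbers are illustrative assumptions, not taken from the post.
# A spec without a version (pandas here) leaves the choice to Conda.
specs="python=3.8 scikit-learn=0.23 pandas"

# Assemble the command and print it for review; nothing is installed here.
create_cmd="conda create --name iris_prediction $specs"
echo "$create_cmd"
```

Running the printed line in a terminal would then create the environment with those pins, assuming the requested versions are available for your platform.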
And now, if we run the command, Conda lists all the packages that will be downloaded and installed and asks us whether we want to proceed.
See slide 3
Type “y” and hit enter to create the environment.
Okay, so now that we have created a new environment, let’s activate it. To do that, we just run the command:
See slide 4
If we do that, the name in parentheses at the front of the command prompt changes to “iris_prediction”.
See slide 5
So, we are now in this particular environment. Let’s say we work on the code for our project and, once we are finished, we want to share that code. As explained earlier, we then also want to share our specific environment so that people can recreate it. To do that, let’s first go to the desktop by typing “cd Desktop”, and then we run the following command:
See slide 6
The actual command is “conda env export”, and the argument “--file” specifies the name of the file that we want to create, namely “environment.yml”. If we run this command, it creates the file in the current directory, so it will be saved to the desktop. We can then share this file together with our code, on GitHub for example.
If we open the “environment.yml” file, we can see that the name of our environment is stated at the top.
See slide 7
After that, the channels that we used to install the libraries are listed (more on that later). And then, the actual dependencies are listed.
So now, let’s see how we can create a new environment from this file. First, let’s rename the environment in the file to “iris_prediction_2”.
See slide 8
And then, let’s run the following command to create the environment that is specified in “environment.yml” (the file has to be in the directory that we are currently in).
See slide 9
The actual command is “conda env create”, and the argument “--file” specifies the file from which we want to create the environment, namely “environment.yml”.
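The renaming step can be done in any text editor. As a minimal sketch of doing it from the command line instead, the snippet below writes a small stand-in for an exported file into a temporary directory and rewrites its “name:” line with sed. The file contents, the “defaults” channel entry and the variable names “workdir” and “env_file” are assumptions for illustration; a real export is longer.

```shell
# Work in a throwaway directory so no real environment.yml is overwritten.
workdir=$(mktemp -d)
env_file="$workdir/environment.yml"

# Illustrative stand-in for an exported file (assumption); a real export
# lists many more dependencies, usually with exact versions.
cat > "$env_file" <<'EOF'
name: iris_prediction
channels:
  - defaults
dependencies:
  - pandas
  - scikit-learn
EOF

# Rewrite only the "name:" line, going through a temporary file so the
# command behaves the same with GNU and BSD sed.
sed 's/^name: .*/name: iris_prediction_2/' "$env_file" > "$env_file.tmp"
mv "$env_file.tmp" "$env_file"

# Print the edited name line.
grep '^name:' "$env_file"
```

From the directory that holds the edited file, “conda env create --file environment.yml” would then create the “iris_prediction_2” environment.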
And now, to see if that environment was created, let’s list all of our available environments with the “conda env list” command.
See slide 10
As you can see, we have 3 environments in total: the “base” environment that is available from the start, the “iris_prediction” environment that we created earlier, and the “iris_prediction_2” environment that we created from the “environment.yml” file. The star, by the way, indicates which environment is currently active, which is the “iris_prediction” environment.
Okay, so now let’s see how we can remove an environment in case we don’t need it anymore. We are going to remove the “iris_prediction_2” environment.
See slide 11
The actual command is “conda env remove”, and the argument “--name” specifies which environment we want to delete, in this case “iris_prediction_2”. If we now run “conda env list” again, we can see that we only have 2 environments left and that “iris_prediction_2” is gone.
See slide 12
And that’s already it. Those are, in my opinion, the most common environment-specific commands that one might use on a regular basis (except for one that we will mention later). So now, let’s look at the most common commands for managing packages.
Package-specific commands
The first one deals with actually installing packages. So, let’s install Matplotlib.
See slide 13
As before, when we created a new environment, we could also list several packages at once in this command (which, again, is preferable), or we could specify a particular version of a package if we wanted to. When we run the command, it installs the respective package into the currently active environment. So, that’s how we can simply install a package.
Sometimes, however, a package is not available via this command, for example (currently) the package “mnist”. Then, we get a message telling us to go to Anaconda.org.
See slides 14-15
There, we can search for that package and see if it is available over a different channel.
See slide 16
In the search results, the channel name is stated in the “Package” column: it is the name in front of the forward slash (the “owner”). The entry after the forward slash is the name of the actual package. The last column, “Platforms”, states the operating systems for which the respective installation is applicable. The entry “noarch” (no architecture) means that you can use it on any operating system. For that reason, we install the mnist package from the second row by running the following command:
See slide 17
The argument “--channel” specifies which channel we want to use, “conda-forge” in this case. “Conda-forge” is probably one of the biggest channels, and you will probably see it quite often. They even have their own website. So, they are generally a reliable source for installing packages.
In rare cases, a package might also not be available via Anaconda.org. Or if it is, the channels that offer it might only have a low number of downloads (so they might not be that trustworthy). An example of such a package is (currently) “discord.py”.
See slide 18
In those cases, we can use pip to install the respective package (which is also the suggested way to install discord.py). To do that, we first need to make sure that pip is installed in our current environment.
See slide 19
The “conda list” command simply lists all the packages of the current environment.
See slide 20
If pip isn’t listed, we need to install it first. Once it is installed, we can run the following command:
See slide 21
So to recap, those are the three ways in which we can install packages, and ideally we should try them in this order: first “conda install”; if that doesn’t work, “conda install --channel”; and if that also doesn’t work, “pip install”.
Another guideline is that we should first use Conda to install as many packages as we can.
Only then should we use pip for the remaining packages that we still need. That’s because running “conda install” after “pip install” has been used can cause dependency issues. So, that’s one thing we have to keep in mind.
Okay, and now there are basically just two things left that we can do with a package: we can uninstall it or we can update it. The commands for that are pretty straightforward. To uninstall a package, we simply say:
See slide 22
This will uninstall the Matplotlib library. And now, let’s see how we update a package. First, let’s deactivate the current environment by saying:
See slide 23
That’s the one environment-specific command that was missing earlier, and it brings us back to the base environment.
See slide 24
And then, let’s run the following command:
See slide 25
This will update the “conda” package itself. That is something that we should do regularly in our “base” environment to make sure that we are always using the most current version of Conda.
And that’s it. Those are, in my opinion, the most common commands that we need to know to manage packages with Conda.
Conclusion
And now lastly, let’s have a look at all the commands that we have covered in this tutorial.
See slide 26
This image hopefully makes it clear why I first explained what a package manager does and what an environment manager does. Namely, if you know what they do and what they are for, then the actual Conda commands that you need to use are pretty straightforward. And if you want to do something that we haven’t covered in this tutorial, like cloning one of your environments for example, you can also just check out the Conda documentation.
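For reference, the commands described in this post can be collected in a small cheat sheet. The sketch below is reconstructed from the text, not copied from slide 26, so the exact spelling on the slides may differ (for instance, “conda uninstall” is assumed for the uninstall step). It only prints the list and does not run Conda; the variable name “cheatsheet” is an illustrative choice.

```shell
# Store the command overview in a variable; nothing here is executed by Conda.
cheatsheet=$(cat <<'EOF'
conda create --name iris_prediction scikit-learn pandas   # create an environment
conda activate iris_prediction                            # activate it
conda deactivate                                          # go back to "base"
conda env export --file environment.yml                   # write the environment to a file
conda env create --file environment.yml                   # create an environment from a file
conda env list                                            # list all environments
conda env remove --name iris_prediction_2                 # delete an environment
conda install matplotlib                                  # install a package
conda install --channel conda-forge mnist                 # install from a specific channel
conda list                                                # list packages in the active environment
pip install discord.py                                    # fall back to pip
conda uninstall matplotlib                                # uninstall a package
conda update conda                                        # update Conda itself (in "base")
EOF
)

# Print the overview.
printf '%s\n' "$cheatsheet"
```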