Data Science and Data Processing

OPINION

Bye-bye Python. Hello Julia!

As Python’s lifetime grinds to a halt, a hot new competitor is emerging

If Julia is still a mystery to you, don’t worry. Photo by Julia Caesar on Unsplash

The Zen of Python versus the Greed of Julia

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
[...]
ABC paved the way for Python, which is paving the way for Julia. Photo by David Ballew on Unsplash
We are greedy: we want more. We want a language that's open source, with a liberal license. We want the speed of C with the dynamism of Ruby. We want a language that's homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like Matlab. We want something as usable for general programming as Python, as easy for statistics as R, as natural for string processing as Perl, as powerful for linear algebra as Matlab, as good at gluing programs together as the shell. Something that is dirt simple to learn, yet keeps the most serious hackers happy. We want it interactive and we want it compiled.

What Julia developers are loving

Versatility

Speed

Community

Code conversion

Libraries are still a strong point of Python. Photo by Susan Yin on Unsplash

Libraries

Dynamic and static types

The data: Invest in things while they’re small

Number of questions tagged Julia (left) and Python (right) on StackOverflow.
It’s time to show Julia some love. Photo by Alexander Sinn on Unsplash

Bottom line: Do Julia and let it be your edge

Even though we recognize that we are inexcusably greedy, we still want to have it all. About two and a half years ago, we set out to create the language of our greed. It's not complete, but it's time for a 1.0 release — the language we've created is called Julia. It already delivers on 90% of our ungracious demands, and now it needs the ungracious demands of others to shape it further. So, if you are also a greedy, unreasonable, demanding programmer, we want you to give it a try.

Python Lambda Expressions in Data Science

Upgrade your python coding standards to upgrade your research

Sep 2 · 3 min read
Photo by Max Baskakov on Unsplash

Coding efficiently is one of the key reasons to use Python, and lambda expressions are no exception. Python lambdas are small anonymous functions with concise syntax, whereas regular functions can at times be overly descriptive and quite long.

Python is one of a few languages that had lambda functions added to their syntax, whereas other languages, like Haskell, use lambda expressions as a core concept.

Whatever your use-case of a Lambda function, it’s really good to know what they’re about and how to use them.

Why Use Lambda Functions?

The true power of a lambda function can be shown when used inside another function but let’s start on the easy step.

Say you have a function definition that takes one argument and adds 10 to it:

def identity(x):
    return x + 10

However this can be compressed into a simple one-liner as follows:

identity = lambda a : a + 10

This function can then be used as follows:

identity(10) 

which will give the answer 20.

Now with this simple concept, we can also extend this to have more than one input as follows:

myfunc = lambda a, b, c : a + b + c

So the following:

myfunc(2,3,4)

will give the result 9. It’s really that simple!

Now a really cool use case of Lambda expressions occurs when you use lambda functions within functions. Take the following example:

def myfunc(n):
    return lambda a : a * n

Here, the function myfunc returns a lambda function which multiplies the input a by a pre-defined integer, n. This allows the user to create functions on the fly:

mydoubler = myfunc(2)
mytripler = myfunc(3)

As can be seen, mydoubler is a function that simply multiplies an input by 2, whereas mytripler multiplies an input by 3. Test it out!

print(mydoubler(11))
print(mytripler(11))

This brings about the answers 22 and 33.
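Returning a lambda is one way to use a lambda inside another function; passing one as an argument is another. The sketch below uses the built-ins sorted and map, and the sample records are made up purely for illustration:

```python
# Hypothetical sample data: (name, score) pairs invented for this example.
records = [("carol", 72), ("alice", 91), ("bob", 85)]

# A lambda as the sort key: order the records by score, highest first.
by_score = sorted(records, key=lambda rec: rec[1], reverse=True)

# A lambda passed to map: apply the same small transformation to every item.
doubled = list(map(lambda n: n * 2, [1, 2, 3]))

print(by_score)
print(doubled)
```

In both calls the lambda is used once, right where it is needed, which is the situation where its brevity tends to pay off.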

Photo by Ian Stauffer on Unsplash

Are Lambdas Pythonic or Not?

PEP 8, the Python style guide, includes the following recommendation, which advises against binding a lambda expression directly to a name:

Always use a def statement instead of an assignment statement that binds a lambda expression directly to an identifier.

Yes:

def f(x):
    return 2*x

No:

f = lambda x: 2*x

The logic behind this probably has more to do with readability than with any personal vendetta against lambda expressions. Admittedly, they can make it a bit more difficult to understand the use case, but as a coder who prefers efficiency and simplicity in code, I do feel that there’s a place for them.

However, readable code has to be the most important feature of any code — debatably more important than efficiently run code.
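One concrete reason PEP 8 gives for the rule is that a function created with def carries its own name, which is more useful in tracebacks and string representations, while every lambda is simply named "<lambda>". A minimal check of that difference:

```python
def f(x):
    return 2 * x

g = lambda x: 2 * x  # the form PEP 8 advises against

# Both compute the same result, but only the def version knows its own name.
print(f.__name__)
print(g.__name__)
```

The first print shows f and the second shows <lambda>, so an error raised inside g is harder to trace back to its definition.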

Example Math Formulas

Mean:

mu = lambda x: sum(x) / len(x)

Variance:

variance = lambda x: sum((xi - mu(x))**2 for xi in x) / (len(x) - 1)
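As a quick sanity check, the two formulas can be compared against Python's standard statistics module. This is a sketch only: the data values are made up, and the variance lambda is written element by element so that it works on a plain list.

```python
import math
import statistics

mu = lambda x: sum(x) / len(x)
# Sample variance (divides by n - 1), computed element by element.
variance = lambda x: sum((xi - mu(x)) ** 2 for xi in x) / (len(x) - 1)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up sample values

print(mu(data), statistics.mean(data))
print(variance(data), statistics.variance(data))
print(math.isclose(variance(data), statistics.variance(data)))
```

Note that this variance lambda calls mu(x) once per element, which is fine for a short example but wasteful on large inputs.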


The New Jupyter Book

Jupyter Book extends the notebook idea

Aug 25 · 11 min read

2020–08–07 | On the Jupyter blog, Chris Holdgraf announces a rewrite of the Jupyter Book project.

“Jupyter Book is an open source project for building beautiful, publication-quality books, websites, and documents from source material that contains computational content. With this post, we’re happy to announce that Jupyter Book has been re-written from the ground up, making it easier to install, faster to use, and able to create more complex publishing content in your books. It is now supported by the Executable Book Project, an open community that builds open source tools for interactive and executable documents in the Jupyter ecosystem and beyond.”

Source: https://jupyterbook.org/

What does the new Jupyter Book do?

The new version of Jupyter Book will feel very similar. However, it has a lot of new features due to the new Jupyter Book stack underneath (more on that later).

The new Jupyter Book has the following main features:

Write publication-quality content in markdown
You can write in either Jupyter markdown, or an extended flavor of markdown with publishing features. This includes support for rich syntax such as citations and cross-references, math and equations, and figures.

Write content in Jupyter Notebooks
This allows you to include your code and outputs in your book. You can also write notebooks entirely in markdown to execute when you build your book.

Execute and cache your book’s content
For .ipynb and markdown notebooks, execute code and insert the latest outputs into your book. In addition, cache and re-use outputs to be used later.

Insert notebook outputs into your content
Generate outputs as you build your documentation, and insert them in-line with your content across pages.

Add interactivity to your book
You can toggle cell visibility, include interactive outputs from Jupyter, and connect with online services like Binder.

Generate a variety of outputs
This includes single- and multi-page websites, as well as PDF outputs.

Build books with a simple command-line interface
You can quickly generate your books with one command, like so: jupyter-book build mybook/

These are just a few of the major changes that we’ve made. For a more complete idea of what you can do, check out the Jupyter Book documentation.

An enhanced flavor of markdown

The biggest enhancement to Jupyter Book is support for the MyST Markdown language. MyST stands for “Markedly Structured Text”, and is a flavor of markdown that implements all of the features of the Sphinx documentation engine, allowing you to write scientific publications in markdown. It draws inspiration from RMarkdown and the reStructuredText ecosystem of tools. Anything you can do in Sphinx, you can do with MyST as well.

MyST Markdown is a superset of Jupyter Markdown (AKA, CommonMark), meaning that any default markdown in a Jupyter Notebook is valid in Jupyter Book. If you’d like extra features in markdown such as citations, figures, references, etc, then you may include extra MyST Markdown syntax in your content.

For example, here’s how you can include a citation in the new Jupyter Book:

A sample citation. Here we see how you can include citation syntax in-line with your markdown, and then insert a bibliography later on in your page. (source: https://executablebooks.org/)

A smarter build system

While the old version of Jupyter Book used a combination of Python and Jekyll to build your book’s HTML, the new Jupyter Book uses Python all the way through. This means that building the HTML for your book is as simple as:

jupyter-book build mybookname/

In addition, the new build system leverages Jupyter Cache to execute notebook content only if the code is updated, and to insert the outputs from the cache at build time. This saves you time by avoiding the need to re-execute code that hasn’t been changed.
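The idea of executing content only when its code has changed can be illustrated with a small hash-keyed cache. This is a conceptual sketch, not how Jupyter Cache is actually implemented; run_cached and fake_execute are hypothetical names introduced here for illustration.

```python
import hashlib

_cache = {}  # maps a hash of the source code to its stored output

def run_cached(source, execute):
    """Call execute(source) only if this exact source has not been seen before."""
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = execute(source)
    return _cache[key]

calls = []

def fake_execute(source):
    # Stand-in for running a notebook cell; records each real execution.
    calls.append(source)
    return f"output of: {source}"

run_cached("print(1 + 1)", fake_execute)  # first time: executed
run_cached("print(1 + 1)", fake_execute)  # unchanged source: served from the cache
run_cached("print(2 + 2)", fake_execute)  # edited source: executed again
print(len(calls))
```

Only two real executions happen for the three requests, because the second request finds its output already stored under the same hash.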

An example build process. Here the jupyter-book command-line interface is used to convert a collection of content into an HTML book. (source: https://blog.jupyter.org/)

More book output types

By leveraging Sphinx, Jupyter Book will be able to support more complex outputs than just an HTML website. For example, we are currently prototyping PDF Outputs, both via HTML as well as via LaTeX. This gives Jupyter Book more flexibility to generate the right book for your use case.

You can also run Jupyter Book on individual pages. This means that you can write single-page content (like a scientific article) entirely in Markdown.

A new stack

The biggest change under-the-hood is that Jupyter Book now uses the Sphinx documentation engine instead of Jekyll for building books. By leveraging the Sphinx ecosystem, Jupyter Book can more effectively build on top of community tools, and can contribute components back to the broader community.

Instead of being a single repository, the old Jupyter Book repository has now been separated into several modular tools. Each of these tools can be used on their own in your Sphinx documentation, and they can be coordinated together via Jupyter Book:

  • The MyST markdown parser for Sphinx allows you to write fully-featured Sphinx documentation in Markdown.
  • MyST-NB is an .ipynb parser for Sphinx that allows you to use MyST Markdown in your notebooks. It also provides tools for execution, caching, and variable insertion of Jupyter Notebooks in Sphinx.
  • The Sphinx Book Theme is a beautiful book-like theme for Sphinx, built on top of the PyData Sphinx Theme.
  • Jupyter Cache allows you to execute a collection of notebooks and store their outputs in a hashed database. This lets you cache your notebook’s output without including it in the .ipynb file itself.
  • Sphinx-Thebe converts your “static” HTML page into an interactive page with code cells that are run remotely by a Binder kernel.
  • Finally, Jupyter Book also supports a growing collection of Sphinx extensions, such as sphinx-copybutton, sphinx-togglebutton, sphinx-comments, and sphinx-panels.

What next?

Jupyter Book and its related projects will continue to be developed as a part of the Executable Book Project, a community that builds open source tools for high-quality scientific publications from computational content in the Jupyter ecosystem and beyond.

Photo by Markus Winkler on Unsplash

Overview and installation

Install the command-line interface

First off, make sure you have the CLI installed so that you can work with Jupyter Book. The Jupyter-Book CLI allows you to build and control your Jupyter Book. You can install it via pip with the following command:

pip install -U jupyter-book

The book building process

Building a Jupyter Book broadly consists of three steps:

Put your book content in a folder or a file. Jupyter Book needs the following pieces in order to build your book:

  • Your content file(s) (the pages of your book) in either markdown or Jupyter Notebooks.
  • A Table of Contents YAML file (_toc.yml) that defines the structure of your book. Mandatory when building a folder.
  • (optional) A configuration file (_config.yml) to control the behavior of Jupyter Book.

Build your book. Using Jupyter Book’s command-line interface you can convert your pages into either an HTML or a PDF book.

Host your book’s HTML online. Once your book’s HTML is built, you can host it online as a public website. See Publish your book online for more information.

Create a template Jupyter Book

We’ll use a small template book to show what kinds of files you might put inside your own. To create a new Jupyter Book, type the following at the command-line:

jupyter-book create mybookname

A new book will be created at the path that you’ve given (in this case, mybookname/).

If you would like to quickly generate a basic Table of Contents YAML file, run the following command:

jupyter-book toc mybookname/

And it will generate a TOC for you. Note that there must be at least one content file in each folder in order for any sub-folders to be parsed.

Inspecting your book’s contents

Let’s take a quick look at some important files in the demo book you created:

mybookname/
├── _config.yml
├── _toc.yml
├── content.md
├── intro.md
├── markdown.md
├── notebooks.ipynb
└── references.bib

Here’s a quick rundown of the files you can modify for yourself, and that ultimately make up your book.

Book configuration

All of the configuration for your book is in the following file:

mybookname/
├── _config.yml

You can define metadata for your book (such as its title), add a book logo, turn on different “interactive” buttons (such as a Binder button for pages built from a Jupyter Notebook), and more.

Table of Contents

Jupyter Book uses your Table of Contents to define the structure of your book. For example, your chapters, sub-chapters, etc.

The Table of Contents lives at this location:

mybookname/
├── _toc.yml

This is a YAML file with a collection of pages, each one linking to a file in your content/ folder. Here’s an example of a few pages defined in _toc.yml.

- file: features/features
  sections:
  - file: features/markdown
  - file: features/notebooks

The top-most entries in your TOC file are book chapters. Above, this is the “Features” page. Note that in this case the title of the page is not explicitly specified but is inferred from the source files. This behavior is controlled by the page_titles setting in _config.yml (see Files for more details). Each chapter can have several sections (defined in sections:) and each section can have several sub-sections. For more information about how section structure maps onto book structure, see How headers and sections map onto book structure.

Each item in the _toc.yml file points to a single file. The links should be relative to your book’s folder and with no extension.

For example, in the example above there is a file at mybookname/features/notebooks.ipynb. The TOC entry that points to this file is here:

- file: features/notebooks
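The rule that TOC links are relative to the book folder and carry no extension can be sketched as a small helper. The function name toc_entry is hypothetical and is not part of Jupyter Book; it only illustrates the path convention described above.

```python
from pathlib import PurePosixPath

def toc_entry(content_file, book_root):
    """Return the value for a TOC `file:` key: relative to the book folder, no extension."""
    relative = PurePosixPath(content_file).relative_to(book_root)
    return str(relative.with_suffix(""))

print(toc_entry("mybookname/features/notebooks.ipynb", "mybookname"))
print(toc_entry("mybookname/intro.md", "mybookname"))
```

The first call yields features/notebooks, matching the entry shown above.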

Book content

The markdown and ipynb files in your folder are your book’s content. Some content files for the demo book are shown below:

mybookname/
...
├── content.md
└── notebooks.ipynb

Note that the content files are either Jupyter Notebooks or Markdown files. These are the files that define “sections” in your book.

You can store these files in whatever collection of folders you’d like. Note that the structure of your book when it is built will depend solely on the order of items in your _toc.yml file (see the Table of Contents section above).

Book bibliography for citations

If you’d like to build a bibliography for your book, you can do so by including the following file:

mybookname/
└── references.bib

This BibTeX file can be used to insert citations into your book’s pages. For more information, see Citations and cross-references.

Next step: build your book

Now that you’ve got a Jupyter Book folder structure, we can create the HTML (or PDF) for each of your book’s pages.

Build your book

Once you’ve added content and configured your book, it’s time to build outputs for your book. We’ll use the jupyter-book build command-line tool for this.

Currently, there are two kinds of supported outputs: an HTML website for your book, and a PDF that contains all of the pages of your book that is built from the book HTML.

Prerequisites

In order to build the HTML for each page, you should have followed the steps in creating your Jupyter Book structure. You should have a collection of notebook/markdown files in your mybookname/ folder, a _toc.yml file that defines the structure of your book, and any configuration you’d like in the _config.yml file.

Build your book’s HTML

Now that your book’s content is in your book folder and you’ve defined your book’s structure in _toc.yml, you can build the HTML for your book.

Note: HTML is the default builder.

Do so by running the following command:

jupyter-book build mybookname/

This will generate a fully-functioning HTML site using a static site generator. The site will be placed in the _build/html folder. You can then open the pages in the site by entering that folder and opening the html files with your web browser.

Note: You can also use the short-hand jb for jupyter-book, e.g. jb build mybookname/.

Build a standalone page

Sometimes you’d like to build a single page of content rather than an entire book. For example, if you’d like to generate a web-friendly HTML page from a Jupyter Notebook for a report or publication.

You can generate a standalone HTML file for a single page of the Jupyter Book using the same command:

jupyter-book build path/to/mypage.ipynb

This will execute your content and output the proper HTML in a _build/html folder.

Your page will be called mypage.html. This will work for any content source file that is supported by Jupyter Book.

Note: Building single pages in the context of a larger project can trigger warnings and incomplete links. For example, building docs/start/overview.md will issue a number of unknown document, term not in glossary, and undefined links warnings.

Page caching

By default, Jupyter Book will only build the HTML for pages that have been updated since the last time you built the book. This helps reduce the amount of unnecessary time needed to build your book. If you’d like to force Jupyter Book to re-build a particular page, you can either edit the corresponding file in your book’s folder, or delete that page’s HTML in the _build/html folder.

Local preview

To preview your book, you can open the generated HTML files in your browser. Either double-click the html file in your local folder, or enter the absolute path to the file in your browser navigation bar, adding file:// at the beginning (e.g. file:///Users/my_path_to_book/_build/html/index.html).
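If you would rather not assemble the file:// address by hand, Python's pathlib can produce it from an absolute path. This is a small sketch that assumes a macOS or Linux style path; the path reuses the placeholder from the text, and preview_url is a hypothetical helper name.

```python
from pathlib import Path
import webbrowser

def preview_url(index_file):
    """Turn an absolute path to a built HTML file into a file:// URL."""
    return Path(index_file).as_uri()

url = preview_url("/Users/my_path_to_book/_build/html/index.html")
print(url)
# webbrowser.open(url)  # uncomment to open the page in your default browser
```

Note that the resulting URL has three slashes after file: because the absolute path itself begins with a slash.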

Next step: publish your book

Now that you’ve created the HTML for your book, it’s time to publish it online.

Publish your book online

Once you’ve built the HTML for your book, you can host it online. The best way to do this is with a service that hosts static websites (because that’s what you have just created with Jupyter Book). There are many options for doing this, and these sections cover some of the more popular ones.

Create an online repository for your book

Regardless of the approach you use for publishing your book online, it will require you to host your book’s content in an online repository such as GitHub. This section describes one approach you can use to create your own GitHub repository and add your book’s content to it.

  1. First, log in to GitHub, then go to the “create a new repository” page: https://github.com/new
  2. Next, give your online repository a name and a description. Make your repository public and do not initialize with a README file, then click “Create repository”.
  3. Now, clone the (currently empty) online repository to a location on your local computer. You can do this via the command line with:
git clone https://github.com/<my-org>/<my-repository-name>

4. Copy all of your book files and folders into this newly cloned repository. For example, if you created your book locally with jupyter-book create mylocalbook and your new repository is called myonlinebook, you could do this via the command line with:

cp -r mylocalbook/* myonlinebook/

5. Now you need to sync your local and remote (i.e., online) repositories. You can do this with the following commands:

cd myonlinebook
git add ./*
git commit -m "adding my first book!"
git push




Rank Game Publisher
1 리니지M NCSOFT
2 리니지2M NCSOFT
3 바람의나라: 연 NEXON Company
4 R2M Webzen Inc.
5 기적의 검 4399 KOREA
6 뮤 아크엔젤 Webzen Inc.
7 KartRider Rush+ NEXON Company
8 가디언 테일즈 Kakao Games Corp.
9 V4 NEXON Company
10 블레이드&소울 레볼루션 Netmarble
11 라이즈 오브 킹덤즈 LilithGames
12 라그나로크 오리진 GRAVITY Co., Ltd.
13 일루전 커넥트 ChangYou
14 리니지2 레볼루션 Netmarble
15 Epic Seven Smilegate Megaport
16 그랑삼국 YOUZU(SINGAPORE)PTE.LTD.
17 A3: 스틸얼라이브 Netmarble
18 AFK 아레나 LilithGames
19 FIFA ONLINE 4 M by EA SPORTS™ NEXON Company
20 스테리테일 4399 KOREA
21 FIFA Mobile NEXON Company
22 슬램덩크 DeNA HONG KONG LIMITED
23 PUBG MOBILE PUBG CORPORATION
24 동방불패 모바일 Perfect World Korea
25 Lords Mobile: Kingdom Wars IGG.COM
26 Roblox Roblox Corporation
27 마구마구 2020 Netmarble
28 메이플스토리M NEXON Company
29 Age of Z Origins Camel Games Limited
30 왕좌의게임:윈터이즈커밍 YOOZOO GAMES KOREA CO., LTD.
31 Gardenscapes Playrix
32 Rise of Empires: Ice and Fire Long Tech Network Limited
33 검은사막 모바일 PEARL ABYSS
34 Empires & Puzzles: Epic Match 3 Small Giant Games
35 Pmang Poker : Casino Royal NEOWIZ corp
36 한게임 포커 NHN BIGFOOT
37 황제라 칭하라 Clicktouch Co., Ltd.
38 Summoners War Com2uS
39 Brawl Stars Supercell
40 에오스 레드 BluePotion Games
41 Homescapes Playrix
42 일곱 개의 대죄: GRAND CROSS Netmarble
43 Random Dice: PvP Defense 111%
44 Lord of Heroes CloverGames
45 케페우스M Ujoy Games
46 파이브스타즈 SkyPeople
47 Teamfight Tactics: League of Legends Strategy Game Riot Games, Inc
48 카이로스 : 어둠을 밝히는 자 Longtu Korea Inc.
49 Last Shelter: Survival Long Tech Network Limited
50 랑그릿사 ZlongGames

Bringing the best out of Jupyter Notebooks for Data Science

Enhance Jupyter Notebook’s productivity with these Tips & Tricks.

Reimagining what a Jupyter notebook can be and what can be done with it.


1. Executing Shell Commands

In [1]: !ls
example.jpeg list tmp
In [2]: !pwd
/Users/Parul/Desktop/Hello World Folder
In [3]: !echo "Hello World"
Hello World
In [4]: files = !ls
In [5]: print(files)
['example.jpeg', 'list', 'tmp']
In [6]: directory = !pwd
In [7]: print(directory)
['/Users/Parul/Desktop/Hello World Folder']
In [8]: type(directory)
IPython.utils.text.SList
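The ! prefix is IPython syntax, so it only works inside a notebook or IPython shell. In a plain Python script, a rough equivalent of capturing a command's output into a list is the standard subprocess module. The sketch below runs the Python interpreter itself as the child command so that it behaves the same on any operating system; substitute your own command as needed.

```python
import subprocess
import sys

# Run a child process and capture what it prints, similar in spirit to `files = !ls`.
result = subprocess.run(
    [sys.executable, "-c", "print('Hello World')"],
    capture_output=True,  # collect stdout and stderr instead of printing them
    text=True,            # decode the output to str
    check=True,           # raise an error if the command fails
)

lines = result.stdout.splitlines()  # one list item per output line
print(lines)
```

Unlike the SList returned by IPython, lines here is an ordinary Python list of strings.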

2. Jupyter Themes

pip install jupyterthemes
jt -l
# selecting a particular theme
jt -t <name of the theme>
# reverting to original Theme
jt -r
Left: original | Middle: Chesterish Theme | Right: solarizedl theme

3. Notebook Extensions

Installation

conda install -c conda-forge jupyter_nbextensions_configurator
pip install jupyter_contrib_nbextensions && jupyter contrib nbextension install
# in case you get permission errors on macOS:
pip install jupyter_contrib_nbextensions && jupyter contrib nbextension install --user

1. Hinterland

2. Snippets

3. Split Cells Notebook

4. Table of Contents

5. Collapsible Headings

6. Autopep8

4. Jupyter Widgets

Installation

# pip
pip install ipywidgets
jupyter nbextension enable --py widgetsnbextension
# Conda
conda install -c conda-forge ipywidgets
#Installing ipywidgets with conda automatically enables the extension
# Start with some imports!
from ipywidgets import interact
import ipywidgets as widgets

1. Basic Widgets

def f(x):
    return x

# Generate a slider
interact(f, x=10);
# Booleans generate check-boxes
interact(f, x=True);
# Strings generate text areas
interact(f, x='Hi there!');

2. Advanced Widgets

Play Widget

play = widgets.Play(
    # interval=10,
    value=50,
    min=0,
    max=100,
    step=1,
    description="Press play",
    disabled=False
)
slider = widgets.IntSlider()
widgets.jslink((play, 'value'), (slider, 'value'))
widgets.HBox([play, slider])

Date picker

widgets.DatePicker(
    description='Pick a Date',
    disabled=False
)

Color picker

widgets.ColorPicker(
    concise=False,
    description='Pick a color',
    value='blue',
    disabled=False
)

Tabs

tab_contents = ['P0', 'P1', 'P2', 'P3', 'P4']
children = [widgets.Text(description=name) for name in tab_contents]
tab = widgets.Tab()
tab.children = children
for i in range(len(children)):
    tab.set_title(i, str(i))
tab

5. Qgrid

Installation

pip install qgrid
jupyter nbextension enable --py --sys-prefix qgrid
# only required if you have not enabled the ipywidgets nbextension yet
jupyter nbextension enable --py --sys-prefix widgetsnbextension
# only required if you have not added conda-forge to your channels yet
conda config --add channels conda-forge
conda install qgrid

6. Slideshow

1. Jupyter Notebook’s built-in Slide option

jupyter nbconvert *.ipynb --to slides --post serve
# insert your notebook name instead of *.ipynb

2. Using the RISE plugin

conda install -c damianavila82 rise
pip install RISE
jupyter-nbextension install rise --py --sys-prefix
# enable the nbextension:
jupyter-nbextension enable rise --py --sys-prefix

7. Embedding URLs, PDFs, and Youtube Videos

URLs

# Note that http urls will not be displayed. Only https are allowed inside the IFrame
from IPython.display import IFrame
IFrame('https://en.wikipedia.org/wiki/HTTPS', width=800, height=450)

PDFs

from IPython.display import IFrame
IFrame('https://arxiv.org/pdf/1406.2661.pdf', width=800, height=450)

Youtube Videos

from IPython.display import YouTubeVideo
YouTubeVideo('mJeNghZXtMo', width=800, height=300)

Conclusion

Rank Game Publisher
1 리니지M NCSOFT
2 리니지2M NCSOFT
3 바람의나라: 연 NEXON Company
4 R2M Webzen Inc.
5 기적의 검 4399 KOREA
6 뮤 아크엔젤 Webzen Inc.
7 KartRider Rush+ NEXON Company
8 V4 NEXON Company
9 블레이드&소울 레볼루션 Netmarble
10 라그나로크 오리진 GRAVITY Co., Ltd.
11 가디언 테일즈 Kakao Games Corp.
12 라이즈 오브 킹덤즈 LilithGames
13 일루전 커넥트 ChangYou
14 리니지2 레볼루션 Netmarble
15 Epic Seven Smilegate Megaport
16 A3: 스틸얼라이브 Netmarble
17 그랑삼국 YOUZU(SINGAPORE)PTE.LTD.
18 AFK 아레나 LilithGames
19 FIFA ONLINE 4 M by EA SPORTS™ NEXON Company
20 스테리테일 4399 KOREA
21 슬램덩크 DeNA HONG KONG LIMITED
22 FIFA Mobile NEXON Company
23 PUBG MOBILE PUBG CORPORATION
24 Lords Mobile: Kingdom Wars IGG.COM
25 동방불패 모바일 Perfect World Korea
26 메이플스토리M NEXON Company
27 Roblox Roblox Corporation
28 마구마구 2020 Netmarble
29 왕좌의게임:윈터이즈커밍 YOOZOO GAMES KOREA CO., LTD.
30 Age of Z Origins Camel Games Limited
31 Gardenscapes Playrix
32 Rise of Empires: Ice and Fire Long Tech Network Limited
33 검은사막 모바일 PEARL ABYSS
34 Pmang Poker : Casino Royal NEOWIZ corp
35 Empires & Puzzles: Epic Match 3 Small Giant Games
36 한게임 포커 NHN BIGFOOT
37 Summoners War Com2uS
38 황제라 칭하라 Clicktouch Co., Ltd.
39 Brawl Stars Supercell
40 Homescapes Playrix
41 에오스 레드 BluePotion Games
42 Random Dice: PvP Defense 111%
43 Teamfight Tactics: League of Legends Strategy Game Riot Games, Inc
44 일곱 개의 대죄: GRAND CROSS Netmarble
45 Lord of Heroes CloverGames
46 케페우스M Ujoy Games
47 Last Shelter: Survival Long Tech Network Limited
48 카이로스 : 어둠을 밝히는 자 Longtu Korea Inc.
49 컴투스프로야구2020 Com2uS
50 궁3D WISH INTERACTIVE TECHNOLOGY LIMITED

Please Stop Doing These 5 Things in Pandas

These mistakes are super common and super easy to fix.

As someone who did over a decade of development before moving into Data Science, I see data scientists make a lot of mistakes while using Pandas. The good news is these are really easy to avoid, and fixing them can also make your code more readable.

Photo by Daniela Holzer on Unsplash

Mistake 1: Getting or Setting Values Slowly

It’s nobody’s fault that there are way too many ways to get and set values in Pandas. In some situations, you have to find a value using only an index or find the index using only the value. However, in many cases, you’ll have many different ways of selecting data at your disposal: index, value, label, etc.

In those situations, I prefer to use whatever is fastest. Here are some common choices from slowest to fastest; in these tests the fastest ran roughly 88 times faster than the slowest (22.3 s versus 254 ms)!

Tests were run using a DataFrame of 20,000 rows. Here’s the notebook if you want to run it yourself.

# .at - 22.3 seconds
for i in range(df_size):
    df.at[i] = profile
Wall time: 22.3 s
# .iloc - about 1.2x faster than .at
for i in range(df_size):
    df.iloc[i] = profile
Wall time: 19.1 s
# .loc - about 1.4x faster than .at
for i in range(df_size):
    df.loc[i] = profile
Wall time: 16.5 s
# .iat, doesn't work for replacing multiple columns of data.
# Fast but isn't comparable since I'm only replacing one column.
for i in range(df_size):
    df.iloc[i].iat[0] = profile['address']
Wall time: 3.46 s
# .values / .to_numpy() - about 88x faster than .at
for i in range(df_size):
    df.values[i] = profile
    # Recommend using to_numpy() instead if you have Pandas 1.0+
    # df.to_numpy()[i] = profile
Wall time: 254 ms

(As Alex Bruening and miraculixx noted in the comments, for loops are not the ideal way to perform actions like this; look at .apply() instead. I’m using them here purely to show the speed difference of the line inside the loop.)

Mistake 2: Only Using 25% of Your CPU

Whether you’re on a server or just your laptop, the vast majority of people never use all the computing power they have. Most processors (CPUs) have 4 cores nowadays, and by default, Pandas will only ever use one.

From the Modin Docs, a 4x speedup on a 4 core machine.

Modin is a Python module built to enhance Pandas by making way better use of your hardware. Modin DataFrames don’t require any extra code and in most cases will speed up everything you do to DataFrames by 3x or more.

Modin acts as more of a plugin than a library since it uses Pandas as a fallback and cannot be used on its own.

The goal of Modin is to augment Pandas quietly and let you keep working without learning a new library. The only line of code most people will need is import modin.pandas as pd replacing your normal import pandas as pd, but if you want to learn more check out the documentation here.
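A minimal sketch of that import swap follows. The try/except fallback to plain Pandas is my own assumption for machines where Modin is not installed, not something the Modin docs require, and the tiny DataFrame is only there to show that the rest of the code stays unchanged.

```python
try:
    import modin.pandas as pd  # drop-in replacement when Modin is installed
except ImportError:
    import pandas as pd  # assumed fallback: plain Pandas

# Everything below is ordinary DataFrame code, whichever import succeeded.
df = pd.DataFrame({"x": range(10)})
total = int(df["x"].sum())
```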

In order to avoid recreating tests that have already been done, I’ve included this picture from the Modin documentation showing how much it can speed up the read_csv() function on a standard laptop.

Please note that Modin is in development, and while I use it in production, you should expect some bugs. Check the Issues in GitHub and the Supported APIs for more information.

Mistake 3: Making Pandas Guess Data Types

When you import data into a DataFrame and don’t specifically tell Pandas the columns and datatypes, Pandas will read the entire dataset into memory just to figure out the data types.

For example, if you have a column full of text Pandas will read every value, see that they’re all strings, and set the data type to “string” for that column. Then it repeats this process for all your other columns.

You can use df.info() to see how much memory a DataFrame uses. That’s roughly the same amount of memory Pandas will consume just to figure out the data types of each column.

Unless you’re tossing around tiny datasets or your columns are changing constantly, you should always specify the data types. To do this, pass the dtype parameter a dictionary that maps your column names to their data types as strings. For example:

import pandas as pd

# Tell Pandas the column types up front instead of letting it guess
pd.read_csv('fake_profiles.csv', dtype={
    'job': 'str',
    'company': 'str',
    'ssn': 'str'
})

Note: This also applies to DataFrames that don’t come from CSVs.
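To see what guessing can cost beyond memory, here is a small sketch that reads the same CSV text twice, once with inferred types and once with an explicit dtype dictionary. The in-memory CSV and its values are made up for illustration.

```python
import io

import pandas as pd

# Made-up CSV content; the ssn values deliberately start with zeros.
csv_text = "job,company,ssn\nengineer,Acme,001020003\nnurse,Initech,004050006\n"

# Let Pandas guess: the ssn column is parsed as an integer.
inferred = pd.read_csv(io.StringIO(csv_text))

# Declare the types: the ssn column stays text, leading zeros included.
typed = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"job": "str", "company": "str", "ssn": "str"},
)
```

With inferred types the leading zeros are lost, so declaring dtypes protects correctness as well as memory.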

Mistake 4: Leftover DataFrames

One of the best qualities of DataFrames is how easy they are to create and change. The unfortunate side effect of this is most people end up with code like this:

# Change dataframe 1 and save it into a new dataframe
df1 = pd.read_csv('file.csv')
df2 = df1.dropna()
df3 = df2.groupby('thing')

What happens is that df1 and df2 stay in Python memory even though you’ve moved on to df3. Don’t leave extra DataFrames sitting around in memory. If you’re using a laptop, they hurt the performance of almost everything you do. If you’re on a server, they hurt the performance of everyone else on that server (or, at some point, you’ll get an “out of memory” error).

Instead, here are some easy ways to keep your memory clean:

  • Use df.info() to see how much memory a DataFrame is using
  • Install plugin support in Jupyter, then install the Variable Inspector plugin for Jupyter. If you’re used to having a variable inspector in RStudio, you should know that RStudio now supports Python!
  • If you’re in a Jupyter session already, you can always erase variables without restarting by using del df2
  • Chain together multiple DataFrame modifications in one line (so long as it doesn’t make your code unreadable): df = df.apply(thing1).dropna()
  • As Roberto Bruno Martins pointed out, another way to ensure clean memory is to perform operations within functions. You can still unintentionally abuse memory this way, and explaining scope is outside the scope of this article, but if you aren’t familiar I’d encourage you to read this writeup.
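The tips above can be combined in a short sketch. The clean() helper, the column names, and the sample values are hypothetical; the point is only that chaining inside a function leaves no intermediate DataFrames behind, and del releases the name you no longer need.

```python
import pandas as pd


def clean(raw):
    """Chain the steps so no intermediate DataFrame keeps a name."""
    return raw.dropna().reset_index(drop=True)


# Made-up data with one missing value in each column.
raw = pd.DataFrame({"thing": ["a", None, "b"], "value": [1.0, 2.0, None]})

df = clean(raw)
del raw  # drop the original once it is no longer needed

# Total bytes used, counting the contents of string columns as well.
bytes_used = int(df.memory_usage(deep=True).sum())
```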

Mistake 5: Manually Configuring Matplotlib

This might be the most common mistake, but it lands at #5 because it’s the least impactful. I see this mistake happen even in tutorials and blog posts from experienced professionals.

Pandas uses Matplotlib under the hood whenever you call .plot(), and it even sets up some chart configuration for you on every DataFrame.

There’s no need to import and configure it for every chart when it’s already baked into Pandas for you.

Here’s an example of doing it the wrong way. Even though this is a basic chart, it’s still a waste of code:

import matplotlib.pyplot as plt

fig, ax = plt.subplots()  # create a figure and axes to draw on
ax.hist(x=df['x'])
ax.set_xlabel('label for column X')
plt.show()

And here’s the right way:

df['x'].plot()

Easier, right? You can do anything on these DataFrame plot objects that you can do to any other Matplotlib plot object. For example:

df['x'].plot.hist(title='Chart title')
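As a rough sketch of that point, the object returned by the Pandas plot call is a regular Matplotlib Axes, so the usual Axes methods still apply. The sample data is made up, and the Agg backend line is an assumption so the snippet runs without a display; in a notebook you would normally leave it out.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless run; not needed inside Jupyter

import pandas as pd

df = pd.DataFrame({"x": [1, 2, 2, 3, 3, 3]})  # made-up sample data

# The plot call returns a Matplotlib Axes object.
ax = df["x"].plot.hist(title="Chart title")

# Any regular Axes method works on it.
ax.set_xlabel("label for column X")
```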

I’m sure I’m making other mistakes I don’t know about, but hopefully sharing these known ones with you will help put your hardware to better use, let you write less code, and get more done!

If you’re still looking for more optimizations, you’ll definitely want to read:

Rank | Game | Publisher
1 | 리니지M | NCSOFT
2 | 리니지2M | NCSOFT
3 | 바람의나라: 연 | NEXON Company
4 | R2M | Webzen Inc.
5 | 기적의 검 | 4399 KOREA
6 | 뮤 아크엔젤 | Webzen Inc.
7 | KartRider Rush+ | NEXON Company
8 | V4 | NEXON Company
9 | 일루전 커넥트 | ChangYou
10 | 라그나로크 오리진 | GRAVITY Co., Ltd.
11 | 라이즈 오브 킹덤즈 | LilithGames
12 | 블레이드&소울 레볼루션 | Netmarble
13 | FIFA ONLINE 4 M by EA SPORTS™ | NEXON Company
14 | 그랑삼국 | YOUZU(SINGAPORE)PTE.LTD.
15 | Epic Seven | Smilegate Megaport
16 | A3: 스틸얼라이브 | Netmarble
17 | AFK 아레나 | LilithGames
18 | 리니지2 레볼루션 | Netmarble
19 | 메이플스토리M | NEXON Company
20 | 스테리테일 | 4399 KOREA
21 | PUBG MOBILE | PUBG CORPORATION
22 | 동방불패 모바일 | Perfect World Korea
23 | 슬램덩크 | DeNA HONG KONG LIMITED
24 | Lords Mobile: Kingdom Wars | IGG.COM
25 | 가디언 테일즈 | Kakao Games Corp.
26 | Roblox | Roblox Corporation
27 | Teamfight Tactics: League of Legends Strategy Game | Riot Games, Inc
28 | Gardenscapes | Playrix
29 | 왕좌의게임:윈터이즈커밍 | YOOZOO GAMES KOREA CO., LTD.
30 | Pmang Poker : Casino Royal | NEOWIZ corp
31 | Brawl Stars | Supercell
32 | 마구마구 2020 | Netmarble
33 | Age of Z Origins | Camel Games Limited
34 | Rise of Empires: Ice and Fire | Long Tech Network Limited
35 | 검은사막 모바일 | PEARL ABYSS
36 | Empires & Puzzles: Epic Match 3 | Small Giant Games
37 | 한게임 포커 | NHN BIGFOOT
38 | Summoners War | Com2uS
39 | FIFA Mobile | NEXON Company
40 | 황제라 칭하라 | Clicktouch Co., Ltd.
41 | 페이트/그랜드 오더 | Netmarble
42 | 안녕엘라 | (주)알피지리퍼블릭
43 | 케페우스M | Ujoy Games
44 | Homescapes | Playrix
45 | Random Dice: PvP Defense | 111%
46 | 궁3D | WISH INTERACTIVE TECHNOLOGY LIMITED
47 | 컴투스프로야구2020 | Com2uS
48 | Lord of Heroes | CloverGames
49 | Last Shelter: Survival | Long Tech Network Limited
50 | Cookie Run: OvenBreak - Endless Running Platformer | Devsisters Corporation

Interactive spreadsheets in Jupyter

ipywidgets plays an essential part in the Jupyter ecosystem; it brings interactivity between user and data.

Widgets are eventful Python objects that often have a visual representation in the Jupyter Notebook or JupyterLab: a button, a slider, a text input, a checkbox…

More than a library of interactive widgets, ipywidgets is a powerful framework upon which it is straightforward to create new custom widgets. Developers can quickly start their own widgets library with best practices of code structure and packaging using the widget-cookiecutter project.

You can find examples of really nice widgets libraries in the blog-post: Video streaming in the Jupyter Notebook.


A spreadsheet is an interactive tool for data analysis in a tabular form. It consists of cells and cell ranges. It supports value dependent cell formatting/styling and one can apply mathematical functions on cells and perform chained computations. It is the perfect user interface for statistical and financial operations.

The Jupyter Notebook was lacking a spreadsheet library; that’s where ipysheet comes into play.

ipysheet

ipysheet is a new interactive widgets library that aims at implementing the core features of a good spreadsheet application and more.

There are two main widgets in ipysheet, the Cell widget, and the Sheet widget. We provide helper functions for creating rows, columns and cell ranges in general.

The cell value can be a boolean, a numerical value, a string, a date, and of course another widget!

ipysheet uses a Matplotlib-like API for creating a sheet:

Image for post

The user can create entire rows, columns, and even cell ranges:

Image for post

Of course, values in cells are dynamic: a cell value can be updated from Python, and the new value will be visible in the sheet.

It is possible to link a cell value to a widget (in the following screenshot a FloatSlider widget is linked to cell “a”) and to define a specific cell as the result of a custom calculation depending on other cells:

Image for post

Custom styling can be used, using what we call renderers:

Image for post

Support for loading from and exporting to NumPy arrays and Pandas DataFrames was an important feature for us. ipysheet provides from_array, to_array, from_dataframe and to_dataframe functions for this purpose:

Image for post
Image for post

Another killer feature is that a cell value can be ANY interactive widget. This means that the user can put a button or a slider widget in a cell:

Image for post

It also means that a higher-level widget can be put in a cell, whether that widget is a plot from bqplot, a map from ipyleaflet, or even a multi-volume rendering from ipyvolume:

Image for post

You can try it right now with Binder, without installing anything on your computer, just by clicking on this button:

Image for post

The source code is hosted on GitHub: https://github.com/QuantStack/ipysheet/

Similar projects

Acknowledgments

The development of ipysheet is led by QuantStack.

This development is sponsored by Société Générale and Bloomberg.

About the Authors

Maarten Breddels is an entrepreneur and freelance developer / consultant / data scientist working mostly with Python, C++ and JavaScript in the Jupyter ecosystem. He is the founder of vaex.io. His expertise ranges from fast numerical computation and API design to 3D visualization. He has a Bachelor’s in ICT and a Master’s and PhD in Astronomy, and he likes to code and solve problems.


Martin Renou is a Scientific Software Engineer at QuantStack. Before joining QuantStack, he studied at the French Aerospace Engineering School SUPAERO. He also worked at Logilab in Paris and Enthought in Cambridge. As an open source developer at QuantStack, Martin worked on a variety of projects, from xsimd, xtensor, xframe, xeus and xeus-python in C++ to ipyleaflet and ipywebrtc in Python and JavaScript.
