Notes on writing Pythonic code

Introduction

I've collected the rewrites I often end up applying after a first pass at writing Python.

namedtuple

I usually reach for a class when I want to reuse some structure, but I never liked how long the code gets.

class Twitter:
	def __init__(self, account, user, followers, followings, tweets):
		self.account = account
		self.user = user
		self.followers = followers
		self.followings = followings
		self.tweets = tweets

	def __repr__(self):
		return f"{type(self).__name__}(account={repr(self.account)}, user={repr(self.user)}, followers={repr(self.followers)}, followings={repr(self.followings)}, tweets={repr(self.tweets)})"

t = Twitter("Yuriko Koike", "@ecoyuri", 790000, 596, 3979)
print(t)
Twitter(account='Yuriko Koike', user='@ecoyuri', followers=790000, followings=596, tweets=3979)

The same thing can be written with namedtuple:

from collections import namedtuple

Twitter = namedtuple('Twitter', 'account user followers followings tweets')
t = Twitter('Yuriko Koike', '@ecoyuri', 790000, 596, 3979)
print(t)
# Output is the same as above

The definition shrinks to a single line and the readable repr comes for free, so I found it very handy.
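As a small sketch of what the namedtuple version offers beyond the shorter definition: fields can be read by name or by position, instances are immutable, and the helper methods `_replace` and `_asdict` are available. The values are the same illustrative ones used above; the variable names `immutable`, `updated`, and `as_dict` exist only for this example.

```python
from collections import namedtuple

Twitter = namedtuple('Twitter', 'account user followers followings tweets')
t = Twitter('Yuriko Koike', '@ecoyuri', 790000, 596, 3979)

# Fields can be read by name or by position.
print(t.followers)
print(t[1])

# A namedtuple is immutable: assigning to a field raises AttributeError.
try:
    t.followers = 800000
    immutable = False
except AttributeError:
    immutable = True

# _replace returns a new instance with the given fields changed;
# the original instance is left untouched.
updated = t._replace(followers=800000)

# _asdict converts the instance to a dict keyed by field name.
as_dict = t._asdict()
```

If you need to change fields in place, a regular class is probably still the better fit; namedtuple suits records that are created once and then only read.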

yield

For example, consider the following code.

import re

def omit_stopwords(tweets):
	omitted_tweets = []
	for t in tweets:
		# Remove URLs, @{user name} mentions, and #{tag name} hashtags
		reg = r'https?://[\w/:%#\$&\?\(\)~\.=\+\-]+|[@＠][A-Za-z0-9._-]+|[#＃][一-龥_ぁ-ん_ァ-ヺーａ-ｚＡ-Ｚa-zA-Z0-9]+'
		text_mod = re.sub(reg,'',t['text'])
		omitted_tweets.append(text_mod)
	return omitted_tweets

# get_tweets is a function that returns data in the format [{'text': {tweet1}}, {'text': {tweet2}}, ..., {'text': {tweetN}}]
ots = omit_stopwords(get_tweets())

for ot in ots:
	print(f"analyzing the tweet: {ot}")
analyzing the tweet: Today's live stream from 18:45 will be joined by Governor Yoshimura of Osaka Prefecture. ~~~
...
analyzing the tweet: ~~~. We will continue to conduct field surveys to prevent the spread of infection.

Tweet data is usually large, so omitted_tweets becomes a fairly large list, which is not good for memory usage or speed. In that case, the function can be written like this:

def omit_stopwords(tweets):
	for t in tweets:
		# Remove URLs, @{user name} mentions, and #{tag name} hashtags
		reg = r'https?://[\w/:%#\$&\?\(\)~\.=\+\-]+|[@＠][A-Za-z0-9._-]+|[#＃][一-龥_ぁ-ん_ァ-ヺーａ-ｚＡ-Ｚa-zA-Z0-9]+'
		text_mod = re.sub(reg,'',t['text'])
		yield text_mod

ots = omit_stopwords(get_tweets())

for ot in ots:
	print(f"analyzing the tweet: {ot}")

By using yield instead of return, the replacement in omit_stopwords is not executed until the for statement asks for each item. The full list is never built, which should keep memory usage down. You can confirm this by printing the variable ots:

<generator object omit_stopwords at 0x10f957468>

Since it is a generator, you can also pull the results out one at a time with next:

print(f"analyzing the tweet: {next(ots)}")
print(f"analyzing the tweet: {next(ots)}")
print(f"analyzing the tweet: {next(ots)}")
# ...

(If the for statement above has already run, this raises StopIteration because the generator has been used up.)
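The lazy behaviour can be seen in a small self-contained sketch. Here `sample_tweets`, `strip_mentions`, and the `calls` list are hypothetical stand-ins for `get_tweets()` and `omit_stopwords`, and the regex only handles ASCII @mentions to keep the example short.

```python
import re

# Hypothetical stand-in for get_tweets()
sample_tweets = [
    {'text': '@alice hello'},
    {'text': 'no mention here'},
    {'text': 'thanks @bob'},
]

MENTION = re.compile(r'@[A-Za-z0-9._-]+')
calls = []  # records when each tweet is actually processed

def strip_mentions(tweets):
    for t in tweets:
        calls.append(t['text'])
        yield MENTION.sub('', t['text']).strip()

# Calling the function only creates the generator; no tweet is processed yet.
gen = strip_mentions(sample_tweets)
processed_before = len(calls)

# next() runs the body up to the first yield, so exactly one tweet is processed.
first = next(gen)
processed_after_one = len(calls)

# list() consumes whatever is left.
rest = list(gen)

# The generator is now exhausted, so next() raises StopIteration.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True
```

If the results need to be iterated more than once, either call the generator function again or fall back to building a list.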

List comprehensions

import re

reg = r'[@＠][A-Za-z0-9._-]+'
target_tweets = []
# Extract only tweets that do not contain @{user name}
for t in get_tweets():
	if not re.search(reg, t['text']):
		target_tweets.append(t)

The loop above can be written as:

reg = r'[@＠][A-Za-z0-9._-]+'
target_tweets = [t for t in get_tweets() if not re.search(reg, t['text'])]

This is much more compact. Comprehensions are often used when you want to build one list from another.

So I want to use them actively.
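A minimal sketch of the equivalence, using a hypothetical `sample_tweets` list in place of `get_tweets()` and a simplified ASCII-only mention regex. It also shows two closely related forms that are not covered above: a comprehension that transforms each item, and a generator expression, which uses parentheses and is evaluated lazily like the yield version.

```python
import re

# Hypothetical stand-in for get_tweets()
sample_tweets = [
    {'text': '@alice hello'},
    {'text': 'no mention here'},
    {'text': 'good morning'},
]

reg = r'@[A-Za-z0-9._-]+'

# Explicit loop
loop_result = []
for t in sample_tweets:
    if not re.search(reg, t['text']):
        loop_result.append(t)

# Equivalent list comprehension
comp_result = [t for t in sample_tweets if not re.search(reg, t['text'])]

# A comprehension can also transform each item, here keeping only the text.
texts = [t['text'] for t in sample_tweets if not re.search(reg, t['text'])]

# With parentheses instead of brackets this is a generator expression:
# items are produced one at a time instead of being collected into a list.
lazy_texts = (t['text'] for t in sample_tweets if not re.search(reg, t['text']))
first_text = next(lazy_texts)
```

When the condition or the transformation gets complicated, a plain loop is often easier to read than a long comprehension.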

Conclusion

I may add more items as I come across them.
