In the summer of 2015, I joined an AI startup. It was my first serious development work in five years.
We used Python with Django (and later Google App Engine for Python). Django is a framework very similar to Rails, which I had last touched about seven years earlier, so I could keep up with the ideas behind the framework.
However, the code actually running in production was terrible. On top of that, the code written by the server engineer who had joined about a month earlier seemed to carry the Medapani spell (the confusion spell from Dragon Quest).
Below is a list of the symptoms that I noticed.
I was confused.
I thought about the cause.
The team could not do analysis and design in the first place
-- There were no documents at all, with the excuse "because it will change later"
  -- A mindset that is unfriendly to whoever takes over
-- No basic system design
  -- I wanted at least the batch data flow and the major screen transitions
  -- Saying "you can understand it by reading the code" is stage one of crappy-engineer certification
-- Even where a design document existed, its granularity was too coarse
  -- I never saw any UML
  -- The important things on the server side are table design and data flow, but the reaction was just "What's that, is it tasty?"
-- Naming conventions for tables, classes, methods, and variables lacked common sense to a painful degree
  -- Please stop using Set or List in the name of a Model class
  -- Method names did not start with a verb
They did not know enterprise application architecture patterns
-- There was a crappy engineer who claimed to know architecture from Rails alone
  -- To be clear, I know there are good engineers in the Rails world...
-- At a minimum, if you don't know microservices architecture or Domain-Driven Design, implementing an architecture is tough
-- The AI code was just the intern's code, copied and pasted into a utils file
  -- If it is an app or service that a company releases, you have to review it...
Management came from sales and knew far too little about building IT products. Founders starting a company without a good understanding of the tech business is one of the startup anti-patterns.
-- Product management was too amateurish
  -- There was no CTO or product manager at the time of founding (a CTO has since joined and is struggling to improve things)
  -- User interviews were assigned to people without the skills
  -- The CEO is a salesperson. He doesn't write code, yet he doesn't go to user interviews either; he gets media coverage and takes the stage at events.
    -- He did go to an interview once, but he spent it explaining and persuading. No, the purpose is to listen to the problems users are having and dig deeper...
    -- It may be necessary for fundraising, but the most convincing thing is the product, not the presentation material. What is the point if you do nothing to improve its quality...
  -- The COO also comes from sales. The product was built on the premise that it would still be improved, yet he is trying to scale it.
    -- Of the problem hypothesis, solution hypothesis, product hypothesis, and scale hypothesis, he is trying to scale while skipping various steps and validations
    -- Opinions based on Entrepreneur's textbook and Inspired were confused with high-speed PDCA
-- Project management was too amateurish
  -- By the time I heard that the development period was one month, three weeks were left.
  -- Just as I planned to finish another task and start development in the remaining two weeks, a customer presentation came up. One week left.
  -- Three days before the deadline, I learned at the company-wide meeting that the last day of development was the day of the customer presentation. Milestones could at least have been set within the development period... The schedule was moved up, leaving two days.
  -- When I pointed this out at the company-wide meeting, I was told that "the server side only has to look at the DB." At that point I had not yet received the AI code and had no idea what it would look like, data structures included.
  -- In the end, I made the deadline by pulling all-nighters on the AI integration and the GCE environment.
-- Product quality was neglected
  -- Deadlines took priority: release first, for now
  -- Even when you gave an effort estimate, the release date was pulled forward by whoever had more clout
  -- Recovery happened through weekend work and all-nighters
  -- When you got sick, you were told that you were not performing and not delivering value
-- Specifications were decided by internal clout
-- Even when measures to improve the process were agreed on, they were never once followed
For cause 1 (the lack of analysis and design skills), I tried holding a study session on the ICONIX process. Click here for the reference book.
Let's talk about the solution for cause 2.
It is a common misconception, but the Active Record pattern did not start with Rails. It is described in Fowler's Patterns of Enterprise Application Architecture.
As I understand it, the view and the DB table map 1:1, or View:Model:Table is 1:1:1 if you count the model in between. The pattern suits a simple structure and a simple service to which no complicated features will be added.
However, as the system becomes more complex and the tables and models grow, it becomes hard to maintain. The code easily turns into a mess.
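To make the 1:1 mapping concrete, here is a minimal sketch of the Active Record idea using only the standard library. It is not Django's or Rails' implementation; the `item` table, the `Item` class, and the `save` / `find` methods are all names I made up for illustration.

```python
import sqlite3


class Item:
    """Active Record sketch: one instance wraps one row of the `item` table."""

    # Shared in-memory database, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")

    def __init__(self, name, id=None):
        self.id = id
        self.name = name

    def save(self):
        """Insert the row if it is new, otherwise update it."""
        if self.id is None:
            cur = self.conn.execute(
                "INSERT INTO item (name) VALUES (?)", (self.name,))
            self.id = cur.lastrowid
        else:
            self.conn.execute(
                "UPDATE item SET name = ? WHERE id = ?", (self.name, self.id))
        self.conn.commit()
        return self

    @classmethod
    def find(cls, id):
        """Load one row by primary key, or return None if it does not exist."""
        row = cls.conn.execute(
            "SELECT id, name FROM item WHERE id = ?", (id,)).fetchone()
        return cls(name=row[1], id=row[0]) if row else None
```

The object carries both the data and the persistence logic, which is exactly why it is comfortable for one table and starts to hurt once a use case spans several tables.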
Although it overlaps with the symptoms above, I think the problems can be classified into several patterns. The countermeasure for each problem pattern is described below.
The view and batch interfaces have only three things to do.
-- Validate the request and argument parameters
  -- In some cases, I created input validators and delegated the validation to them
  -- But there were too many places to fix...
-- Parse the request and argument parameters and hand the processing over to another layer
-- For views, repack the returned object into the response
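The three responsibilities above can be sketched without any framework. This is only an illustration under my own assumptions: the request is a plain dict, the response is a plain dict, and `validate`, `find_item`, and `item_view` are hypothetical names, not code from the actual project.

```python
class ValidationError(Exception):
    """Raised when request parameters are missing or malformed."""


def validate(params):
    """Step 1: validate the request parameters."""
    if "item_id" not in params:
        raise ValidationError("item_id is required")
    if not str(params["item_id"]).isdigit():
        raise ValidationError("item_id must be a non-negative integer")


def find_item(item_id):
    """Stand-in for the layer below the view (hypothetical)."""
    items = {1: "pen", 2: "notebook"}
    return items.get(item_id)


def item_view(params):
    """A view that only validates, delegates, and repacks."""
    try:
        validate(params)
    except ValidationError as exc:
        return {"status": 400, "body": {"error": str(exc)}}

    item_id = int(params["item_id"])  # Step 2: parse the parameters...
    name = find_item(item_id)         # ...and hand over to another layer
    if name is None:
        return {"status": 404, "body": {"error": "not found"}}

    # Step 3: repack the returned object into the response
    return {"status": 200, "body": {"id": item_id, "name": name}}
```

Anything beyond these three steps (queries, calculations, calls to external services) belongs in the layers described next.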
I use the term service-oriented architecture here, in the usual, common sense of the word.
I changed the views and batch interfaces so that they call service-layer methods.
In Domain-Driven Design, this is the application controller. Note that this is not what Domain-Driven Design calls a service (see Evans's classification).
Patterns of Enterprise Application Architecture also has web presentation patterns, and one of them is likewise called Application Controller.
Oh, it's confusing.
The service layer provides a coarse-grained API to the views. Its naming convention has the following two patterns.
-- CRUD on a single model (table): XxxService
-- CRUD that combines multiple models (tables): YyyEngine
Instantiating a service every time a view calls it is wasteful, so I adopted the recipe that turns a class into a singleton with a class decorator.
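A minimal sketch of that decorator recipe, together with the two naming patterns. The class names and the in-memory dicts are stand-ins I invented for the real models; the actual services in the project wrapped Django models.

```python
def singleton(cls):
    """Class decorator: every call returns the same instance of cls."""
    instances = {}

    def get_instance(*args, **kwargs):
        if cls not in instances:
            instances[cls] = cls(*args, **kwargs)
        return instances[cls]

    return get_instance


@singleton
class ItemService:
    """CRUD on a single model (table): the XxxService pattern."""

    def __init__(self):
        self._items = {}  # in-memory stand-in for the Item table

    def create(self, item_id, name):
        self._items[item_id] = name

    def get(self, item_id):
        return self._items.get(item_id)


@singleton
class ItemFeatureService:
    """CRUD on another single model (table)."""

    def __init__(self):
        self._features = {}  # in-memory stand-in for the feature table

    def create(self, item_id, feature):
        self._features[item_id] = feature

    def get(self, item_id):
        return self._features.get(item_id)


@singleton
class ItemCatalogEngine:
    """CRUD that combines several models (tables): the YyyEngine pattern."""

    def register(self, item_id, name, feature):
        ItemService().create(item_id, name)
        ItemFeatureService().create(item_id, feature)

    def describe(self, item_id):
        return {
            "name": ItemService().get(item_id),
            "feature": ItemFeatureService().get(item_id),
        }
```

One caveat of this recipe: after decoration the name `ItemService` refers to a function, so `isinstance` checks and subclassing no longer work, and the sketch is not thread-safe. For a handful of stateless services called from views, that trade-off seemed acceptable.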
When a JOIN is faster in batch processing, a model extended with a joining QuerySet lets you use the joined data freely. The following describes the case where foreign keys are defined.
Sometimes I later created models that hold additional information for Item.
Treat the JOIN data as if it were a model.
It looks like this.
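models/item.py
from django.db import models

class Item(models.Model):
    name = models.CharField(max_length=128, blank=True)
    description = models.CharField(max_length=1024, blank=True)
    image = models.ImageField(upload_to='hoge', blank=True, null=True)

models/item_feature_1.py
from django.db import models
from .item import Item

class ItemFeature1(models.Model):
    # on_delete is mandatory since Django 2.0; CASCADE was the implicit default before
    item = models.ForeignKey(Item, on_delete=models.CASCADE)
    feature_1 = models.TextField()

models/item_extra_info.py
from django.db import models
from .item import Item

class ItemExtraInfo(models.Model):
    item = models.ForeignKey(Item, on_delete=models.CASCADE)
    info = models.TextField()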
The QuerySet for joining looks like this. It assumes the join is done through a select; other cases are not considered.
joined_query_set.py
from django.db import models

class JoinedItemFeatureQuerySet(models.QuerySet):
    def __iter__(self):
        # Selecting columns across the reverse foreign keys makes Django emit the JOINs
        queryset = self.values(
            'id', 'name', 'description', 'image',
            'itemfeature1__feature_1',
            'itemextrainfo__info')
        results = list(queryset.iterator())
        instances = []
        for result in results:
            item_feature = self.model(
                id=result['id'],
                name=result['name'],
                description=result['description'],
                image=result['image'],
            )
            item_feature.feature_1 = result['itemfeature1__feature_1']  # Add / pack the joined columns as model attributes
            item_feature.info = result['itemextrainfo__info']
            instances.append(item_feature)
        return iter(instances)  # Repack to satisfy the iterator protocol
        # Note: Django's stock QuerySet.__iter__ instead calls self._fetch_all()
        # and packs the results into self._result_cache.
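custom_manager.py
from django.db import models
from joined_query_set import JoinedItemFeatureQuerySet  # import path is an assumption

class JoinedItemFeatureManager(models.Manager):
    def get_queryset(self):
        queryset = JoinedItemFeatureQuerySet(self.model, using=self._db)
        # Keep only rows that have both related records, so the joined columns are never null.
        # del_flg is assumed to be a logical-delete field on Item that the excerpt above omits.
        queryset = queryset.filter(del_flg=False, itemfeature1__isnull=False, itemextrainfo__isnull=False)
        return queryset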
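joined_item_domain.py
# Import paths are assumptions; adjust them to your package layout.
from models.item import Item
from custom_manager import JoinedItemFeatureManager

class JoinedItemFeatureDomain(Item):
    objects = JoinedItemFeatureManager()

    class Meta:
        proxy = True  # Do not create a table.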
With this, you can use the joined data freely via joined_item_features = JoinedItemFeatureDomain.objects.filter(...)
... or at least, that is how it should work.
Calculation logic can be cut out as a strategy or reused as a calculation-only class.
-- For things that can be reused later or turned into the company's own library
-- The trained model of the machine learning system is large. I didn't want to load it many times, so I made it a singleton.
  -- Intermediate variables from the calculation may have remained in memory
  -- In some cases, only the parameters needed for prediction were saved with pickle.dump.
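The "load the big model only once" idea from the list above can be sketched as follows. Everything here is a toy under my own assumptions: `Predictor`, `save_params`, and the `weight` / `bias` parameters are hypothetical, and a real trained model would be far larger than a two-entry dict.

```python
import pickle


def save_params(params, path):
    """Persist only the parameters needed for prediction."""
    with open(path, "wb") as f:
        pickle.dump(params, f)


class Predictor:
    """Calculation-only class that loads its parameters once per process."""

    _instance = None
    load_count = 0  # kept only to show that the file is read a single time

    def __init__(self, params):
        self.params = params

    @classmethod
    def get(cls, path):
        """Return the shared instance, loading the pickle on first use."""
        if cls._instance is None:
            with open(path, "rb") as f:
                cls._instance = cls(pickle.load(f))
            cls.load_count += 1
        return cls._instance

    def predict(self, x):
        """Toy linear prediction using the loaded parameters."""
        return self.params["weight"] * x + self.params["bias"]
```

Saving only the prediction parameters, rather than the whole training object, also avoids dragging intermediate training state along into the pickle. As usual with pickle, only load files you created yourself.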
Django has a mechanism called proxy models, and I enabled it on a subclass.
Below is an excerpt of the code for my personal project.
intro/models/abstract_model.py
from django.contrib.auth.models import User
from django.db import models

class AbstractModel(models.Model):
    registered_at = models.DateTimeField(auto_now_add=True)  # set once, when the row is created
    registered_by = models.ForeignKey(User, on_delete=models.CASCADE, related_name='%(class)s_registered_by')
    updated_at = models.DateTimeField(auto_now=True)  # refreshed on every save()
    updated_by = models.ForeignKey(User, on_delete=models.CASCADE, related_name='%(class)s_updated_by')

    class Meta:
        abstract = True
intro/models/article.py
from django.db import models
from intro.models.abstract_model import AbstractModel
# The imports of Author and Category are omitted, like their class definitions.

class Article(AbstractModel):
    title = models.CharField(max_length=128)
    description = models.CharField(max_length=2048)
    author = models.ForeignKey(Author, on_delete=models.CASCADE)  # The description of the Author class is omitted.
    categories = models.ForeignKey(Category, on_delete=models.CASCADE, null=True, blank=True)  # The description of the Category class is omitted.
    url = models.URLField()
intro/models/article_domain.py
from intro.models.article import Article
from intro.consts import DEFAULT_MECAB_PATH

class ArticleDomain(Article):
    class Meta:
        proxy = True

    def __init__(self, *args, **kwargs):
        super(ArticleDomain, self).__init__(*args, **kwargs)
        # Do initialize...

    def __call__(self, mecab_path=None):
        # Fall back to the default MeCab path when none is given
        self.mecab_path = mecab_path or DEFAULT_MECAB_PATH

    def parse_morpheme(self):
        pass  # Do morpheme analysis

    @classmethod
    def train(cls, filter_params):
        authors = filter_params.get('author')
        articles = None
        if authors:
            articles = Article.objects.filter(author__in=authors)
        # Do some extraction
        # Do training

    def predict(self, texts):
        pass  # Do prediction
I wrote unit tests with from django.test import TestCase while refactoring.
As expected, the code became much easier to follow.
I moved the data extraction conditions into get_queryset of a CustomManager class and had the proxy class hold that CustomManager.
This worked well for domain models that had become complicated for batch processing.
Writing the email-sending implementation directly in a model, or hard-coding urllib2 calls to integrate with external services, is awkward.
I created Gateway classes for external integration (written as adapter in the above figure), and the domain model classes or model classes (written as Application in the above figure) inherit from and use them.
Model classes normally use convenience methods inherited from their superclasses. In the same way, we inherited from the Gateway class and used its convenience methods.
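As a rough sketch of the Gateway-by-inheritance idea, here is a mail gateway mixed into a domain class. This is not the original code: it uses Python 3's standard library rather than urllib2, `Article` is a plain stand-in rather than a Django model, and `MailGateway`, `deliver`, `notify_author`, and the host and address values are all hypothetical.

```python
import smtplib
from email.message import EmailMessage


class MailGateway:
    """Gateway that hides how email is built and delivered."""

    smtp_host = "localhost"            # hypothetical setting
    mail_from = "noreply@example.com"  # hypothetical sender address

    def deliver(self, msg):
        """Actually talk to the external service (here: an SMTP server)."""
        with smtplib.SMTP(self.smtp_host) as smtp:
            smtp.send_message(msg)

    def send_mail(self, to, subject, body):
        """Convenience method inherited by domain classes."""
        msg = EmailMessage()
        msg["From"] = self.mail_from
        msg["To"] = to
        msg["Subject"] = subject
        msg.set_content(body)
        self.deliver(msg)
        return msg


class Article:
    """Stand-in for a model class (not a real Django model)."""

    def __init__(self, title, author_email):
        self.title = title
        self.author_email = author_email


class ArticleDomain(Article, MailGateway):
    """Domain class: business logic only, delivery comes from the gateway."""

    def notify_author(self):
        return self.send_mail(
            self.author_email,
            "Published: " + self.title,
            "Your article is live.",
        )
```

Because the only method that touches the network is `deliver`, a unit test can override it in a subclass and collect messages in a list instead of sending them.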
Regarding cause 3, I decided to give up on the management and change jobs (although only the CTO is decent).
If I had taken the time, I might have been able to improve the company. However, considering how long it takes for people to change, the speed at which technology advances, and the offers from other companies, I concluded that staying would waste my time and my life. I'm not young enough to tolerate such waste. I'm not young enough to deal with the memories of others who are lonely.
I came to recognize that management is extremely important, and not only in startups.