[PYTHON] Path processing with takewhile and dropwhile

For looking something up there are methods like find, and for selecting elements there is filter, but I was never sure when takeWhile and dropWhile would actually be useful.

The other day, though, I felt like I finally put them to good use (?), so I'm posting it as a short note.

Long introduction

Suppose we have a path like this as a string:

data/output/user/contract.csv

I want to pull out certain parts of it.

Specifically, the part up to `output` (let's call it base) and the `user` part (let's call it dir).

For the time being, this implementation worked:

def base = path.split('/')[0..1].join('/') // data/output
def dir = path.split('/')[2]               // user

I want to add more directory levels!

data/ver1.0/output/user/contract.csv

Hmm... the indices have shifted...

def base = path.split('/')[0..2].join('/') // data/ver1.0/output
def dir = path.split('/')[3]               // user

Let's write the test code for the tool!

This is the dummy data used in the test!

test/data/ver1.0/output/user/contract.csv

I... I guess I'll just use an if... haha...

// Declared outside the if so the values are still visible afterwards
def base
def dir
if (isTest) {
  base = path.split('/')[0..3].join('/') // test/data/ver1.0/output
  dir = path.split('/')[4]               // user
} else {
  base = path.split('/')[0..2].join('/') // data/ver1.0/output
  dir = path.split('/')[3]               // user
}

This implementation is going too far, however you look at it.

Index access is hard to read, and it can't withstand any change in the directory structure. On top of that, switching the indices with a flag is painful to look at!

So takewhile / dropwhile!

If you want everything up to `.../output`, use takeWhile; if you want the element right after `output`, dropWhile does the job!

def base = path.split('/').takeWhile {it != 'output'}.join('/').concat('/output') // data/ver1.0/output
def dir = path.split('/').dropWhile {it != 'output'}[1]                           // user

This way the code adapts to changes in the directory structure to some extent (for tool design reasons, the layout below `output` does not change). That said, this was actually my first time writing takeWhile, and the result doesn't include `output` itself... I'm a little disappointed there...
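To check that the approach really copes with the three layouts from the introduction, here is a minimal Python sketch. The helper name `split_at_marker` and its `marker` parameter are made up for illustration, and it assumes the marker directory is present and followed by at least one more segment (otherwise the index lookup fails).

```python
from itertools import dropwhile, takewhile


def split_at_marker(path, marker="output"):
    """Return (base, dir) for a path containing the marker directory.

    Hypothetical helper for illustration; assumes the marker exists and
    is followed by at least one more path segment.
    """
    parts = path.split("/")
    # takewhile stops just before the marker, so the marker is added back
    head = list(takewhile(lambda p: p != marker, parts))
    # dropwhile starts at the marker; index 1 is the segment right after it
    tail = list(dropwhile(lambda p: p != marker, parts))
    return "/".join(head + [marker]), tail[1]


for path in (
    "data/output/user/contract.csv",
    "data/ver1.0/output/user/contract.csv",
    "test/data/ver1.0/output/user/contract.csv",
):
    print(split_at_marker(path))
```

All three paths go through the same code, with no indices to adjust and no test flag.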

Bonus

Groovy

As in the example above, it's easy to write. Method chains read well, and anonymous functions are easy to write.

Python

import itertools

path = 'data/ver1.0/output/user/contract.csv'

# takewhile stops just before 'output', so append it back
taken_iter = itertools.takewhile(lambda x: x != 'output', path.split('/'))
print('/'.join(list(taken_iter)) + '/output')  # data/ver1.0/output

# dropwhile starts at 'output'; index 1 is the element right after it
dropped_iter = itertools.dropwhile(lambda x: x != 'output', path.split('/'))
print(list(dropped_iter)[1])                   # user

I like Python, but this is a little disappointing... Since the xxxwhile functions are imported and called as plain functions rather than methods, you have to wrap the result in list() or join(). Write it on one line and you end up with ...)))). Lambdas are also a little harder to read than in Groovy or Scala, and it's hard to see what the code is doing overall, which is a shame!
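The nesting can be reduced somewhat. The following is only a sketch of one option, not a recommendation: the predicate gets a name (`not_output`, chosen here for illustration) instead of a lambda, and `itertools.islice` picks the element after `output` without building a list first.

```python
from itertools import dropwhile, islice, takewhile

path = "data/ver1.0/output/user/contract.csv"
parts = path.split("/")


def not_output(part):
    """Named predicate replacing the lambda (the name is illustrative)."""
    return part != "output"


# join() accepts any iterable, so the takewhile iterator can be passed directly
base = "/".join(takewhile(not_output, parts)) + "/output"

# islice(..., 1, 2) skips 'output' itself and yields only the next element
directory = next(islice(dropwhile(not_output, parts), 1, 2))

print(base)       # data/ver1.0/output
print(directory)  # user
```

It is still wordier than the Groovy method chain, but each line has at most two levels of parentheses.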

Haskell

import Data.List
import Data.List.Split

main = do
    let path = "data/ver1.0/output/user/contract.csv"

    print $ (intercalate "/"  $ (takeWhile (/= "output") $ splitOn "/" path)) ++ "/output"
    print $ (dropWhile (/= "output") $ splitOn "/" path) !! 1

There's a lot going on, so you end up with a lot of (...)... Also, this has nothing to do with the main topic, but why can't I use splitOn without installing a package?! I use it all the time!

Haskell2

import Data.List
import Data.List.Split

main = do
    let path = "data/ver1.0/output/user/contract.csv"

    let reversed = reverse $ splitOn "/" path
    print $ intercalate "/" $ reverse $ dropWhile (/= "output") reversed
    print $ last $ takeWhile (/= "output") $ reversed

With takeWhile, the all-important `output` is not included in the result, so I tried reversing the elements. A bit cleaner, maybe?
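For comparison, the same reversal trick can be sketched in Python. This is a rough transcription of the idea rather than a line-by-line port of the Haskell; the variable names are chosen for illustration.

```python
from itertools import dropwhile, takewhile

path = "data/ver1.0/output/user/contract.csv"

# Reversed segments: ['contract.csv', 'user', 'output', 'ver1.0', 'data']
reversed_parts = path.split("/")[::-1]

# dropwhile on the reversed list keeps 'output' and everything "before" it,
# so no manual '/output' suffix is needed after flipping it back
kept = list(dropwhile(lambda p: p != "output", reversed_parts))
base = "/".join(reversed(kept))

# takewhile on the reversed list yields the segments after 'output';
# the last of those is the one directly following it
directory = list(takewhile(lambda p: p != "output", reversed_parts))[-1]

print(base)       # data/ver1.0/output
print(directory)  # user
```

One difference to keep in mind: if `output` appeared more than once in the path, the reversed version would split at the last occurrence, whereas the forward version splits at the first.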

I also thought about PHP, but it didn't look like there would be any noticeable difference, so I left it out. Java seemed like a hassle (prejudice), so I skipped it. Writing the same thing in several languages is a good way to get to know their various functions!

Why do these posts always get so long when it's just a small tip... Still, I hope the convenience comes across!

Also, I'm not accepting the opinion that "you should just split the path string on 'output'"!