[LINUX] Use regular expressions in C

The POSIX functions for regular expressions are declared in regex.h as follows. Use them to search text with regular expressions.

#include <regex.h>
int regcomp(regex_t *preg, const char *regex, int cflags);
int regexec(const regex_t *preg, const char *string, size_t nmatch, regmatch_t pmatch[], int eflags);
size_t regerror(int errcode, const regex_t *preg, char *errbuf, size_t errbuf_size);
void regfree(regex_t *preg);

Step 1: First, compile the regular expression string with regcomp so that regexec can use it. Pass the regular expression string in regex; the compiled result is stored in preg. cflags is a bitwise OR of flags that control how the pattern is compiled, and therefore how regexec matches. Their meanings are shown in the table below.

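cflags        Meaning
REG_EXTENDED  Use POSIX extended regular expression syntax (basic syntax is used otherwise).
REG_ICASE     Ignore case.
REG_NOSUB     Do not report match positions (nmatch and pmatch are ignored by regexec).
REG_NEWLINE   Operators that match any character do not match a newline; ^ and $ also match just after and just before a newline.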
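The effect of the flags in the table can be checked with a small sketch like the following. The helper name matches is hypothetical (it is not part of regex.h); it simply compiles a pattern with the given cflags, runs one search, and frees the pattern.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of regex.h): compile `pattern` with
 * `cflags`, test it against `text`, then release the compiled pattern.
 * Returns 1 on a match, 0 on no match, -1 if the pattern does not compile. */
static int matches(const char *pattern, const char *text, int cflags)
{
	regex_t re;

	/* REG_NOSUB is added because only a yes/no answer is needed here. */
	if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
		return -1;

	int rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return rc == 0 ? 1 : 0;
}
```

For example, matches("hello", "HELLO world", REG_ICASE) should report a match while the same call with cflags 0 should not, and "ab+" should match "abbb" only when REG_EXTENDED is given, because + is an ordinary character in basic syntax.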

Step 2: Next, search with regexec.

For preg, pass the pattern compiled with regcomp; for string, pass the character string to be searched. nmatch and pmatch[] are used to get the matched positions. eflags is a bitwise OR of flags that affect regexec. Their meanings are shown in the table below.

eflags      Meaning
REG_NOTBOL  The operator that matches the beginning of a line (^) always fails.
REG_NOTEOL  The operator that matches the end of a line ($) always fails.
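One situation where REG_NOTBOL is useful is finding every match in a string by restarting the search after the previous match: the restart point is not a real beginning of line, so ^ should not match there. The following is a minimal sketch under that assumption; the helper name count_matches is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: count the non-overlapping matches of `pattern`
 * (extended syntax) in `text`. Returns -1 if the pattern does not compile. */
static int count_matches(const char *pattern, const char *text)
{
	regex_t re;
	regmatch_t m[1];      /* m[0] receives the offsets of the whole match */
	const char *p = text;
	int eflags = 0;
	int count = 0;

	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return -1;

	while (regexec(&re, p, 1, m, eflags) == 0) {
		count++;
		if (m[0].rm_eo == 0) {
			/* Empty match at p: step one character to avoid looping forever. */
			if (*p == '\0')
				break;
			p++;
		} else {
			p += m[0].rm_eo;  /* continue just after this match */
		}
		/* From now on p is not the start of the line, so ^ must not match. */
		eflags = REG_NOTBOL;
	}

	regfree(&re);
	return count;
}
```

With this sketch, the pattern "^ab" is counted once in "ababab", whereas without REG_NOTBOL each restart would look like a new beginning of line.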

regexec returns 0 when a match is found and REG_NOMATCH when there is no matching string.
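As an illustration of the return value together with nmatch and pmatch[], the sketch below extracts the first parenthesized group of a match. The helper name first_group and its interface are assumptions made for this example; the rm_so and rm_eo members of regmatch_t are the start and end byte offsets of each match.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the first capture group of `pattern` found in
 * `text` into `out` (truncated to outsz - 1 bytes).
 * Returns 0 on a match, REG_NOMATCH when nothing matches, or the regcomp
 * error code when the pattern is invalid. */
static int first_group(const char *pattern, const char *text,
                       char *out, size_t outsz)
{
	regex_t re;
	regmatch_t m[2];  /* m[0]: whole match, m[1]: first group */

	int rc = regcomp(&re, pattern, REG_EXTENDED);
	if (rc != 0)
		return rc;

	rc = regexec(&re, text, 2, m, 0);
	if (rc == 0 && outsz > 0) {
		size_t len = 0;
		if (m[1].rm_so != -1) {  /* -1 means the group did not take part */
			len = (size_t)(m[1].rm_eo - m[1].rm_so);
			if (len >= outsz)
				len = outsz - 1;
			memcpy(out, text + m[1].rm_so, len);
		}
		out[len] = '\0';
	}

	regfree(&re);
	return rc;
}
```

Note that the pattern must be compiled without REG_NOSUB for the offsets to be filled in.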

Step 3: regcomp allocates memory with malloc and stores it in the regex_t structure, so release it with regfree once the pattern is no longer needed.

Error handling

regcomp returns a nonzero error code on failure. Use regerror to convert this error code into a message string.
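When the required buffer size is not known in advance, regerror can be called with a buffer size of 0 to obtain the length of the message first. The following sketch relies on that behavior; the helper name regex_error_string is hypothetical.

```c
#include <regex.h>
#include <stdlib.h>

/* Hypothetical helper: return a malloc'ed message for an error code from
 * regcomp or regexec, or NULL if the allocation fails.
 * The caller is responsible for calling free() on the result. */
static char *regex_error_string(int errcode, const regex_t *re)
{
	/* With errbuf_size 0, regerror only reports the size needed,
	 * including the terminating '\0'. */
	size_t need = regerror(errcode, re, NULL, 0);

	char *msg = malloc(need);
	if (msg != NULL)
		regerror(errcode, re, msg, need);
	return msg;
}
```

A fixed-size buffer, as used in the example program in this article, is usually sufficient; this variant only avoids truncating long messages.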

Example

Let's make something like a grep command.

example.c



#include <stdio.h>
#include <stdlib.h>
#include <regex.h>

int main(int argc, char *argv[]) {
	if (argc != 3) {
		fprintf(stderr, "usage:\n%s PATTERN FILENAME\n", argv[0]);
		exit(1);
	}

	int err;
	char err_str_buf[4096] = {0};

	// STEP1
	regex_t patbuf;
	err = regcomp(&patbuf, argv[1], REG_EXTENDED | REG_NOSUB | REG_NEWLINE);
	if (err != 0) {
		regerror(err, &patbuf, err_str_buf, sizeof(err_str_buf));
		fprintf(stderr, "regcomp: %s\n", err_str_buf);
		exit(1);
	}

	FILE *fp;
	char buf[1024] = {0};
	fp = fopen(argv[2], "r");
	if (fp == NULL) {
		perror(argv[2]);
		exit(1);
	}
	//STEP2
	while(fgets(buf, sizeof(buf), fp)) {
		if (regexec(&patbuf, buf, 0, NULL, 0) == 0) {
			fputs(buf, stdout);
		}
	}
	//STEP3
	regfree(&patbuf);
	fclose(fp);
	return 0;
}

Execution result

$ gcc -o example_program example.c
$ ./example_program 'reg.*' example.c
#include <regex.h>
    regex_t patbuf;
    err = regcomp(&patbuf, argv[1], REG_EXTENDED | REG_NOSUB | REG_NEWLINE);
        regerror(err, &patbuf, err_str_buf, sizeof(err_str_buf));
        fprintf(stderr, "regcomp: %s\n", err_str_buf);
        if (regexec(&patbuf, buf, 0, NULL, 0) == 0) {
    regfree(&patbuf);

Reference

https://linuxjm.osdn.jp/html/LDP_man-pages/man3/regex.3.html
