Finisky Garden

NLP, Software Engineering, Product Design


Lightsquid is a handy log analyzer for the Squid proxy. However, the project has not been maintained since 2009.

Today I found that Lightsquid doesn't work in 2021. Why?

Lightsquid Error

When I run lightparser, the output looks like this:

$ ./lightparser.pl access.log.66666
>>> use file :: /var/log/squid/access.log.66666
run TIME: 1 sec
LightSquid parser statistic report

                 3312 lines processed (average 3312.00 lines per second)
                    0 lines parsed
                    0 lines recovered
                    0 lines notrecovered
                    0 lines skiped by bad year
                 3312 lines skiped by date filter
                    0 lines skiped by Denied filter
                    0 lines skiped by skipURL filter

WARNING !!!!, parsed 0 lines from total : 3312
please check confiuration !!!!
may be wrong log format selected ?

It seems that all lines are filtered out by the date filter. I double-checked the log format and made sure it is correct and identical to that of previous days.
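
One way to sanity-check the date filter is to look at the timestamps in the log itself. Squid's native access.log format starts each line with a Unix epoch timestamp; a small sketch (assuming GNU date; the timestamp value below is an illustrative example, not taken from the log above):

```shell
# Squid's native log format starts each line with <epoch>.<millis>.
# Convert a timestamp to a calendar date to see which day the line falls on:
ts=1609459200.123          # example value in squid's <sec>.<msec> format
date -u -d "@${ts%.*}" +%Y-%m-%d
```

Comparing the dates actually present in the log with the date the parser expects can show whether the filter, rather than the log format, is at fault.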

A MongoDB sharded cluster is the most complicated MongoDB architecture, and deploying one in Kubernetes is relatively hard. In this post, we will go through the deployment process with MongoDB Ops Manager.

Before starting, please go through Create a UserDB ReplicaSet first.

A MongoDB sharded cluster consists of the following components:

  • shard: Each shard contains a subset of the sharded data. Each shard can be deployed as a replica set.
  • mongos: The mongos acts as a query router, providing an interface between client applications and the sharded cluster.
  • config servers: Config servers store metadata and configuration settings for the cluster.

In this post, we are going to create a sharded cluster with 2 shards (each a 3-instance replica set), 2 mongos instances, and 3 config servers.
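
As a rough sketch, such a topology is described declaratively in a single custom resource. Field names below follow the MongoDB Enterprise Kubernetes Operator's ShardedCluster resource; the resource name, namespace, version, project ConfigMap, and credentials are all placeholders, not values from this post:

```shell
# Sketch: apply a sharded-cluster custom resource (placeholder names/values).
kubectl apply -f - <<'EOF'
apiVersion: mongodb.com/v1
kind: MongoDB
metadata:
  name: userdb-sharded
  namespace: mongodb
spec:
  type: ShardedCluster
  version: "4.2.8"
  shardCount: 2            # 2 shards
  mongodsPerShardCount: 3  # each shard is a 3-instance replica set
  mongosCount: 2           # 2 query routers
  configServerCount: 3     # 3 config servers
  opsManager:
    configMapRef:
      name: my-project     # placeholder Ops Manager project ConfigMap
  credentials: my-credentials
EOF
```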

This is part 5, in which we will use the generated certificates to enable user database TLS and auth.

MongoDB Ops Manager Series:

  1. Install MongoDB Ops Manager
  2. Create a UserDB ReplicaSet
  3. Expose UserDB to Public
  4. Openssl Generates Self-signed Certificates
  5. Enable UserDB TLS and Auth

Understanding Different Secure Connections

Before we start, look at the following three TLS connections:

This is part 4, in which we will create a self-signed CA certificate and three server certificates.


Self-signed certificates are not recommended for production: they cannot prevent man-in-the-middle attacks. However, since our main purpose is to encrypt the communication rather than to authenticate the peers, self-signed certificates are acceptable here.

This is part 3, in which we will expose the user database pods to the public so that Mongo clients are able to access them.


So far, the user database can be accessed only inside the Kubernetes cluster. The official blog's approach is to expose the pods via NodePort: Connect to a MongoDB Database Resource from Outside Kubernetes
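
A minimal sketch of that NodePort approach (the pod and service names are placeholders, not taken from the post):

```shell
# Expose one replica-set member on a NodePort (placeholder names).
kubectl expose pod userdb-0 --type=NodePort --port=27017 --name=userdb-0-svc
# Find out which node port was allocated:
kubectl get svc userdb-0-svc -o jsonpath='{.spec.ports[0].nodePort}'
```

A client outside the cluster can then connect to any node's IP on that allocated port.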

This is part 2, in which we will create a user database that is a 3-instance ReplicaSet.


The so-called Application Database is the backend DB of Ops Manager; it cannot be used to store user data. The user database is called a MongoDB Deployment. Note that this deployment is different from a Kubernetes Deployment.

It's pretty easy to configure a standalone MongoDB instance (almost zero configuration). However, if you want to run a production-level MongoDB cluster, the configuration process is non-trivial: replication, sharding, dynamic scaling, backup, transport encryption, and monitoring are all required. Is there a nice tool to help us?

A MongoDB cluster is a distributed system, which is well suited to running in Kubernetes. However, coordinating MongoDB instances usually requires manually running commands on each instance, independently of Kubernetes. Therefore, the MongoDB Enterprise Kubernetes Operator was developed to bridge this gap. Moreover, MongoDB Ops Manager is a great web portal that helps with these automation tasks.

When you set up TLS/SSL for MongoDB (Configure mongod and mongos for TLS/SSL), you might encounter the following errors:

{"t":{"$date":"2020-11-30T08:02:19.406+00:00"},"s":"E",  "c":"NETWORK",  "id":23248,   "ctx":"main","msg":"Cannot read certificate file","attr":{"keyFile":"/etc/ssl/testserver1.pem","error":"error:0200100D:system library:fopen:Permission denied"}}
{"t":{"$date":"2020-11-30T08:02:19.406+00:00"},"s":"F",  "c":"CONTROL",  "id":20574,   "ctx":"main","msg":"Error during global initialization","attr":{"error":{"code":140,"codeName":"InvalidSSLConfiguration","errmsg":"Can not set up PEM key file."}}}

or

{"t":{"$date":"2020-11-30T08:01:14.545+00:00"},"s":"I",  "c":"ACCESS",   "id":20254,   "ctx":"main","msg":"Read security file failed","attr":{"error":{"code":30,"codeName":"InvalidPath","errmsg":"permissions on / are too open"}}}

So what are the right ownership and permissions for the certificate pem file? The answer: the pem file should be readable, but not writable, by the mongodb user.
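
On the real server this boils down to a chown plus a restrictive chmod (run as root; the pem path is taken from the error message above). The permission change itself is demonstrated here on a scratch file:

```shell
# On the real server (as root), hand the key file to the mongodb user,
# readable but not writable:
#   chown mongodb:mongodb /etc/ssl/testserver1.pem
#   chmod 400 /etc/ssl/testserver1.pem
# Demonstrated on a scratch file standing in for the pem:
pem=$(mktemp)
chmod 400 "$pem"
stat -c '%a' "$pem"    # prints 400: owner read-only
```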

Monitoring MySQL server metrics is crucial for a DBA. Typically, we can simply monitor the recent server status summary through mysqlbench. But what do these metrics mean? Some of them are self-explanatory, such as connections and traffic, while others are not. For example, what's the difference between Selects per second and InnoDB reads per second? How do we measure write performance?

The following figure illustrates the server status: mysqlbench server status

Master-slave replication is widely used in production. Monitoring the replication lag is a common and critical task. Typically, we are able to get the real-time difference between the master and the slave by periodically checking the Seconds_Behind_Master variable.

According to the link:

Seconds_Behind_Master: this field shows an approximation for difference between the current timestamp on the slave against the timestamp on the master for the event currently being processed on the slave.
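
A sketch of such a periodic check (the credentials are placeholders; SHOW SLAVE STATUS requires the REPLICATION CLIENT privilege):

```shell
# Print the replication lag every 5 seconds (placeholder credentials).
while true; do
  mysql -u monitor -p'secret' -e 'SHOW SLAVE STATUS\G' \
    | grep -w Seconds_Behind_Master
  sleep 5
done
```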

We developed a REST API a long time ago and recently found that the released client has a flaw: the HttpRequestMessage is missing the content-type application/json. In earlier versions we manually deserialized the request JSON by reading the request body, but now we leverage the AspNetCore framework to automatically bind the request structure from the API parameter. However, the legacy client no longer works: an HTTP 415 "Unsupported Media Type" error occurs.

Therefore, for backward compatibility, we need to make the server treat all incoming requests as application/json even if the content type is plain text (the default).

Someone may argue that there is no reason to disable IPv6 on Linux. For me, the reason is that on some IPv6-enabled websites I am frequently classified as a bot who needs to solve captchas, which is really annoying :-( . Considering that the IPv6 address never changes (it is assigned by the service provider), disabling IPv6 is the simplest solution.

Let's come to the solution. Just use your favorite editor to add one line to /etc/sysctl.conf:
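
The setting usually used for this is the disable_ipv6 sysctl; a sketch (run as root; whether the post's exact line matches is an assumption on my part):

```shell
# Disable IPv6 on all interfaces via sysctl (run as root).
echo 'net.ipv6.conf.all.disable_ipv6 = 1' >> /etc/sysctl.conf
echo 'net.ipv6.conf.default.disable_ipv6 = 1' >> /etc/sysctl.conf
# Apply without a reboot:
sysctl -p
```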

When we use Entity Framework to manipulate a SQL database, we use Where to query and Include to load related entities (a join operation).

Sample database table schema:

  • Employee: Id, Name, Age
  • Salary: Id, BasePay

Typical query scenarios:

  • Query employee 3's name => Employee.Where(x => x.Id == 3)
  • Query employee Jack's age => Employee.Where(x => x.Name == "Jack")
  • Query employee Jack's basepay => Employee.Where(x => x.Name == "Jack").Include(x => x.Salary) …

To keep the code clean and focused, the following examples omit the dbContext creation logic; we assume db = DbContext(), which contains the Employee table.

I installed the old Lightsquid on Ubuntu 20.04. When visiting the CGI page, an internal server error popped up.

Debug the CGI script by running it directly in /var/www/lightsquid:

/var/www/lightsquid$ perl index.cgi
Can't locate CGI.pm in @INC (you may need to install the CGI module) (@INC contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.30.0 /usr/local/share/perl/5.30.0 /usr/lib/x86_64-linux-gnu/perl5/5.30 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.30 /usr/share/perl/5.30 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base) at index.cgi line 19.
BEGIN failed--compilation aborted at index.cgi line 19.

It seems that some modules are missing. After some searching: https://packages.ubuntu.com/search?suite=trusty&arch=any&mode=filename&searchon=contents&keywords=cgi.pm
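
CGI.pm was removed from the Perl core distribution (as of Perl 5.22), and on Ubuntu it is packaged as libcgi-pm-perl, so installing that package should fix the error:

```shell
# Install the CGI module from the Ubuntu archive:
sudo apt-get install -y libcgi-pm-perl
# Alternative: install from CPAN
#   cpan CGI
# Verify the module now loads:
perl -MCGI -e 'print $CGI::VERSION, "\n"'
```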

Viewing the SQL query generated by Entity Framework is important: the translated SQL may not be what you expect, and sometimes it leads to significant performance issues.

Printing the generated query for an IQueryable<T> in EF Core is different from classic EF. I finally found a working solution, which is listed below.

The following code is tested with MySql.Data.EntityFrameworkCore 8.0.16. Just put the following class into your project:

// Usings assumed for EF Core 2.x internal namespaces:
using System.Linq;
using System.Reflection;
using Microsoft.EntityFrameworkCore.Query;
using Microsoft.EntityFrameworkCore.Query.Internal;
using Microsoft.EntityFrameworkCore.Storage;

public static class QueryableExtensions
{
    private static readonly TypeInfo QueryCompilerTypeInfo = typeof(QueryCompiler).GetTypeInfo();
    private static readonly FieldInfo QueryCompilerField = typeof(EntityQueryProvider).GetTypeInfo().DeclaredFields.First(x => x.Name == "_queryCompiler");
    private static readonly FieldInfo QueryModelGeneratorField = typeof(QueryCompiler).GetTypeInfo().DeclaredFields.First(x => x.Name == "_queryModelGenerator");
    private static readonly FieldInfo DataBaseField = QueryCompilerTypeInfo.DeclaredFields.Single(x => x.Name == "_database");
    private static readonly PropertyInfo DatabaseDependenciesField = typeof(Database).GetTypeInfo().DeclaredProperties.Single(x => x.Name == "Dependencies");

    public static string ToSql<TEntity>(this IQueryable<TEntity> query)
    {
        var queryCompiler = (QueryCompiler)QueryCompilerField.GetValue(query.Provider);
        var queryModelGenerator = (QueryModelGenerator)QueryModelGeneratorField.GetValue(queryCompiler);
        var queryModel = queryModelGenerator.ParseQuery(query.Expression);
        var database = DataBaseField.GetValue(queryCompiler);
        var databaseDependencies = (DatabaseDependencies)DatabaseDependenciesField.GetValue(database);
        var queryCompilationContext = databaseDependencies.QueryCompilationContextFactory.Create(false);
        var modelVisitor = (RelationalQueryModelVisitor)queryCompilationContext.CreateQueryModelVisitor();
        modelVisitor.CreateQueryExecutor<TEntity>(queryModel);
        var sql = modelVisitor.Queries.First().ToString();

        return sql;
    }
}

The usage is straightforward, just append a .ToSql() after your IQueryable<T>:

Today I found that some pods in the Kubernetes cluster had failed with status Waiting: ContainerCreating. The pod events:

MountVolume.SetUp failed for volume "xxxxx" : secret "xxxxx" not found
kubelet aks-agentpool-xxx-vmss000001

Unable to attach or mount volumes: unmounted volumes=[xxxxx], unattached volumes=[xxxxx]: timed out waiting for the condition

I remembered that about one week ago I deleted some secrets in this cluster. Therefore, the problem becomes: how do we recover the deleted secret "xxxxx"?
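
If the secret's contents are still known (from a backed-up manifest or the underlying values), the simplest fix is to recreate it. A sketch with placeholder keys and values ("xxxxx" is the redacted name from the events above):

```shell
# Recreate the missing secret (placeholder keys/values).
kubectl create secret generic xxxxx \
  --from-literal=username=admin \
  --from-literal=password=changeme
# The stuck pods should mount it on the kubelet's next retry; if not,
# deleting a pod forces its controller to recreate it:
#   kubectl delete pod <stuck-pod>
```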

Recently, I found a huge number of files named .com.google.Chrome.* being created in my /tmp folder. Obviously the culprit is Chrome. However, after some research, I found no way to prevent Chrome from creating this garbage.

/tmp$ du -csh .com.google.Chrome.*

8.0K    .com.google.Chrome.00OKwD
104K    .com.google.Chrome.013jYf
172K    .com.google.Chrome.015x5t
...
48K     .com.google.Chrome.Zytrhf
16K     .com.google.Chrome.zz233G
36K     .com.google.Chrome.ZzrsZY
163M    total
/tmp$  find /tmp -name ".com.google.Chrome*" -ls| wc -l
3468
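
Short of preventing the files, they can at least be reaped periodically with find. A sketch, demonstrated in a scratch directory rather than the real /tmp:

```shell
# Simulate the leftover files in a scratch dir, then reap them with find.
demo=$(mktemp -d)
touch "$demo/.com.google.Chrome.aaa" "$demo/.com.google.Chrome.bbb"
find "$demo" -maxdepth 1 -name '.com.google.Chrome*' -exec rm -rf {} +
ls -A "$demo"              # prints nothing: the directory is empty again
rmdir "$demo"
# Against the real /tmp, add e.g. -mtime +1 to spare files Chrome still uses.
```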

Update 2020/12/12: I found lots of .com.google.Chrome.* files in /tmp/snap.chromium/tmp :-( . Look at the ncdu results:

Yesterday, when I logged in to my VM, I found that the disk was full; even a tab command completion could not be done. I immediately deleted some unused files and the system worked again. However, when I rebooted, the system got stuck at A start job is running for Create Volatile Files and Directories...:

...
[  OK  ] Started File System Check on /dev/disk/cloud/azure_resource-part1.
[  OK  ] Started File System Check Daemon to report status.
[  OK  ] Started File System Check on /dev/d…5d996-7436-4ec8-b5d6-0d7b6100aeb5.
         Mounting /data...
[  OK  ] Mounted /data.
[  OK  ] Reached target Local File Systems.
         Starting AppArmor initialization...
         Starting ebtables ruleset management...
         Starting Tell Plymouth To Write Out Runtime Data...
         Starting Set console font and keymap...
[  OK  ] Started Set console font and keymap.
[  OK  ] Started Tell Plymouth To Write Out Runtime Data.
[  OK  ] Started ebtables ruleset management.
[  OK  ] Started Flush Journal to Persistent Storage.
         Starting Create Volatile Files and Directories...
[  OK  ] Started AppArmor initialization.
[    **] A start job is running for Create V… Directories (13min 46s / no limit)

I searched and found a similar discussion: Boot stuck at “A start job is running for Create Volatile Files and Directories”

After I changed my HOME folder to another place, I copied the ssh config folder from the old HOME to the new location. Supposedly it should just work, right? However, when I logged in to the server with my private key, the server said: “Server Refused Our Key”…

I spent some time figuring out the problem: it is an access-mode issue on the new HOME folder, which SHOULD NOT have group write access.
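
sshd (with the default StrictModes yes) rejects a key whenever $HOME, ~/.ssh, or authorized_keys is writable by group or others. The fix, demonstrated on a scratch directory standing in for the new HOME:

```shell
home=$(mktemp -d)                       # stand-in for the new HOME folder
mkdir -p "$home/.ssh"
touch "$home/.ssh/authorized_keys"
chmod g-w,o-w "$home"                   # no group/other write on HOME itself
chmod 700 "$home/.ssh"                  # .ssh: owner only
chmod 600 "$home/.ssh/authorized_keys"  # key file: owner read/write only
stat -c '%a' "$home/.ssh"               # prints 700
```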

Unity supports three ways of registering types:

  • Instance registration
  • Type registration
  • Factory registration

Typically, Instance registration and Type registration resolve dependencies through ResolvedParameter<T>, while Factory registration resolves dependencies via a factory delegate. In practice, when you want to resolve a List<T> or a Dictionary<T1, T2>, Factory registration is what you want. Let's go through how to resolve a collection of customized classes.

Animal Example

To start with an example, assume we have an IAnimal interface with two implementations, and a Zoo class which accepts a List<IAnimal> as a parameter: