
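Fixing VS Code not finding custom Python packages

  • If VS Code reports that module xxx of the current project cannot be found, add the project path or the path of xxx to the PYTHONPATH environment variable

  • In settings.json under .vscode, add the following so that Python programs run from the integrated terminal can find custom modules
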
"python.autoComplete.extraPaths": ["${workspaceFolder}", "/path/to/your/module"],
"terminal.integrated.env.windows": {
"PYTHONPATH": "${workspaceFolder};${env:PYTHONPATH};"
}
  • In launch.json under .vscode, add the following so that debug mode can find custom modules
"env": {"PYTHONPATH":"${workspaceFolder};${env:PYTHONPATH}"}

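Notes

python.autoComplete.extraPaths tells the VS Code Python extension where to look for additional packages (with the Pylance language server, the corresponding setting is python.analysis.extraPaths), and terminal.integrated.env.windows sets the PYTHONPATH environment variable of the integrated terminal. PyCharm does not report missing custom packages because it automatically adds the project path to PYTHONPATH; in VS Code this has to be done by hand.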
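
The effect of a PYTHONPATH entry can be checked from Python itself. The following is a minimal sketch, not part of the VS Code settings: the helper `can_import` and the module name `my_custom_pkg` are hypothetical names chosen for illustration. It adds a directory to `sys.path`, which is what a PYTHONPATH entry does at interpreter startup, and reports whether the module becomes importable.

```python
import importlib
import importlib.util
import os
import sys
import tempfile


def can_import(module_name, extra_path=None):
    """Return True if module_name can be located, optionally after adding
    extra_path to sys.path (the same effect as a PYTHONPATH entry)."""
    if extra_path and extra_path not in sys.path:
        sys.path.insert(0, extra_path)
    importlib.invalidate_caches()
    return importlib.util.find_spec(module_name) is not None


# Demo: a throwaway "project" directory containing one hypothetical module.
project_dir = tempfile.mkdtemp()
with open(os.path.join(project_dir, "my_custom_pkg.py"), "w") as f:
    f.write("VALUE = 1\n")

before = can_import("my_custom_pkg")              # directory not on sys.path yet
after = can_import("my_custom_pkg", project_dir)  # found once the path is added
print(before, after)
```

Running the same check inside the integrated terminal and inside a debug session is a quick way to see whether settings.json and launch.json are both taking effect.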

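Fixing garbled Chinese output from Tomcat on Windows

Common fixes

Tomcat 9 writes its log in UTF-8 by default, while the Windows cmd console defaults to GBK, so the output is garbled.
The fixes found online fall into three groups:

  • Change the default code page of cmd (tested: no effect)
  1. Press Win+R to open Run and enter regedit
  2. Go to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Command Processor
  3. Add a string value named autorun with the value chcp 65001
  • Change the default code page of the Tomcat console window (tested: works)
  1. Press Win+R to open Run and enter regedit
  2. Go to HKEY_CURRENT_USER\Console\Tomcat, creating the key if it does not exist
  3. Add a DWORD (32-bit) value named CodePage, select Decimal, and set it to 65001
  • Edit tomcat/conf/logging.properties (tested: works, but GET/POST debug parameters may then be garbled)
  1. Set java.util.logging.ConsoleHandler.encoding = GBK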
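
The mismatch behind the garbled output can be reproduced in a few lines. This is only an illustrative sketch (the sample string is made up): bytes written as UTF-8 are read back with the GBK code page, as a default cmd console would do, and then with the matching encoding.

```python
text = "服务器启动"  # sample log message containing Chinese characters

# What a UTF-8 logger writes to the console.
utf8_bytes = text.encode("utf-8")

# A GBK console interprets the same bytes with the wrong code page.
garbled = utf8_bytes.decode("gbk", errors="replace")

# Decoding with the matching encoding restores the text.
restored = utf8_bytes.decode("utf-8")

print(garbled)
print(restored)
```

Both working fixes above remove the mismatch from one side: either the console is switched to code page 65001 (UTF-8), or Tomcat is told to emit GBK.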

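Setting Tomcat's default language to English

If the Chinese messages are not needed, Tomcat's default language can be switched to English instead:

  1. Open tomcat/bin/catalina.bat
  2. After the setlocal line, add set "CATALINA_OPTS=%CATALINA_OPTS% -Duser.language=en" (without a leading rem, which would turn the line into a comment)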

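Garbled Chinese in the IDEA console

Even when Tomcat itself starts without garbled output, Chinese text printed by your own code to the IDEA console can still be garbled. To fix it:

  1. Click Edit Configurations to open the Tomcat run configuration
  2. Add -Dfile.encoding=UTF-8 to VM options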

MySQL tips

Comparing times stored as varchar

Use the str_to_date(str, format) function, which parses the string and returns a date/datetime value.

Suppose there is a user table:

id  username  login_time
1   aaa       2021-12-03
2   bbb       2020-11-22
3   ccc       2019-10-01
4   ddd       2018-04-28

# Method 1
select username,login_time from user where str_to_date(login_time,'%Y-%m-%d') between '2020-01-01' and '2020-12-31';
# Method 2
select username,login_time from user where str_to_date(login_time,'%Y-%m-%d') between str_to_date('2020-01-01','%Y-%m-%d') and str_to_date('2020-12-31','%Y-%m-%d');
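
The same idea, parse the string into a real date before comparing, can be sketched on the client side in Python. This is an illustration only, not a replacement for the SQL above; the rows mirror the sample table and `parse_day` is a hypothetical helper.

```python
from datetime import datetime

# Rows as (id, username, login_time), with login_time stored as text.
rows = [
    (1, "aaa", "2021-12-03"),
    (2, "bbb", "2020-11-22"),
    (3, "ccc", "2019-10-01"),
    (4, "ddd", "2018-04-28"),
]


def parse_day(text):
    """Counterpart of str_to_date(text, '%Y-%m-%d')."""
    return datetime.strptime(text, "%Y-%m-%d")


start, end = parse_day("2020-01-01"), parse_day("2020-12-31")

# Keep the rows whose parsed login_time falls inside [start, end].
matches = [(name, day) for _id, name, day in rows
           if start <= parse_day(day) <= end]
print(matches)
```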

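Tasks stuck at 0% in database tools such as DBeaver

  • Log in to the MySQL server remotely and kill all remote connections
    # List all connections
    mysql> show full processlist;
    # Generate the kill statements
    mysql> select concat('kill ', id, ';') from information_schema.processlist where command != 'Sleep' and Host!='localhost';
    # Kill the corresponding connection (pid is an id from the list above)
    mysql> kill pid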

pymysql

Exceptions

  1. pymysql.err.InterfaceError: (0, '')
  • Call pymysql.Connection.ping() before executing SQL to test the connection and reconnect automatically

    conn = pymysql.connect(...)
    conn.ping(reconnect=True)
    cursor = conn.cursor()
    cursor.execute(query)
    cursor.close()
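
The ping-before-execute pattern can be wrapped in a small helper. This is a sketch under stated assumptions: `execute_with_ping` is a hypothetical name, and `FakeConnection` / `FakeCursor` are stand-ins that only imitate the few pymysql methods used here so the example runs without a database. With a real connection, pass the object returned by `pymysql.connect(...)`.

```python
def execute_with_ping(conn, query, params=None):
    """Ping (reconnecting if needed) before running a query; return all rows."""
    conn.ping(reconnect=True)
    cursor = conn.cursor()
    try:
        cursor.execute(query, params)
        return cursor.fetchall()
    finally:
        cursor.close()  # always release the cursor, even if execute() fails


class FakeCursor:
    """Stand-in for a pymysql cursor (assumption, for demonstration only)."""

    def __init__(self):
        self.closed = False

    def execute(self, query, params=None):
        self.last_query = query

    def fetchall(self):
        return [(1,)]

    def close(self):
        self.closed = True


class FakeConnection:
    """Stand-in for a pymysql connection (assumption, for demonstration only)."""

    def __init__(self):
        self.pinged = False
        self.last_cursor = None

    def ping(self, reconnect=True):
        self.pinged = True

    def cursor(self):
        self.last_cursor = FakeCursor()
        return self.last_cursor


fake = FakeConnection()
rows = execute_with_ping(fake, "SELECT 1")
print(rows)
```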

    Database connection pool

  • Import the required packages

    import os
    import pymysql
    from dbutils.pooled_db import PooledDB
    from pymysql.cursors import DictCursor
    import configparser
    import logging
    logging.basicConfig(format="%(levelname)s:%(message)s", level=logging.WARNING)
  • Config: a wrapper around the configuration file parser

    class Config(configparser.ConfigParser):
        """
        Usage: Config(config_filepath).get_content("MYSQL")

        Parameters in the configuration file:
        [MYSQL]
        HOST = localhost
        PORT = 3306
        USER = root
        PASSWORD = 123456
        CHARSET = utf8
        """

        def __init__(self, config_filepath):
            super(Config, self).__init__()
            self.read(config_filepath)

        def get_sections(self):
            return self.sections()

        def get_options(self, section):
            return self.options(section)

        def get_content(self, section):
            # Return the section as a dict, converting all-digit values to int
            result = {}
            for option in self.get_options(section):
                value = self.get(section, option)
                result[option] = int(value) if value.isdigit() else value
            return result

        def optionxform(self, optionstr):
            # Override optionxform() so that keys are not converted to lowercase
            return optionstr
  • DBUtil: a connection pool utility class

    class DBUtil:
        # Connection pool shared by all instances
        __pool = None

        def __init__(self, host, user, password, database, port=3306, charset="utf8"):
            self.host = host
            self.port = int(port)
            self.user = user
            self.password = str(password)
            self.database = database
            self.charset = charset
            self.__open()

        def __get_connection(self):
            # Create the pool once and store it on the class (not in a local variable)
            if DBUtil.__pool is None:
                DBUtil.__pool = PooledDB(
                    creator=pymysql,
                    mincached=1,
                    maxcached=20,
                    host=self.host,
                    port=self.port,
                    user=self.user,
                    passwd=self.password,
                    db=self.database,
                    use_unicode=False,
                    charset=self.charset,
                )
            return DBUtil.__pool.connection()

        def __open(self):
            # Take a connection from the pool and create a cursor on it
            self.conn = self.__get_connection()
            self.cursor = self.conn.cursor()

        def begin(self):
            """Start a transaction."""
            self.conn.autocommit(0)

        def end(self, option="commit"):
            """End a transaction with commit or rollback."""
            if option == "commit":
                self.conn.commit()
            else:
                self.conn.rollback()

        def connect(self):
            """
            Connect to the database directly, bypassing the pool
            :return conn: pymysql connection
            """
            try:
                return pymysql.connect(
                    host=self.host,
                    user=self.user,
                    password=self.password,
                    database=self.database,
                    port=self.port,
                    use_unicode=True,
                    charset=self.charset,
                    cursorclass=DictCursor,
                )
            except Exception:
                logging.error("Error connecting to mysql server.")
                raise

        def close(self):
            """Close the cursor and return the connection to the pool."""
            try:
                self.cursor.close()
                self.conn.close()
            except Exception:
                logging.error("Error closing connection to mysql server")

        def get_one(self, sql):
            """Query a single row."""
            self.__open()
            self.cursor.execute(sql)
            res = self.cursor.fetchone()
            self.close()
            return res

        def get_all(self, sql):
            """Query multiple rows."""
            self.__open()
            self.cursor.execute(sql)
            res = self.cursor.fetchall()
            self.close()
            return res

        def get_all_obj(self, sql, table_name, *args):
            """Query rows and return them as dicts keyed by column name."""
            if args:
                fields_list = list(args)
            else:
                # No column names given: look them up in information_schema
                fields_sql = (
                    "select COLUMN_NAME from information_schema.COLUMNS "
                    "where table_name = '%s' and table_schema = '%s'"
                    % (table_name, self.database)
                )
                fields_list = [item[0] for item in self.get_all(fields_sql)]
            return [dict(zip(fields_list, item)) for item in self.get_all(sql)]

        def insert(self, sql, params=None):
            """
            Insert
            :return count: number of affected rows
            """
            return self.__edit(sql, params)

        def update(self, sql, params=None):
            """
            Update
            :return count: number of affected rows
            """
            return self.__edit(sql, params)

        def delete(self, sql, params=None):
            """
            Delete
            :return count: number of affected rows
            """
            return self.__edit(sql, params)

        def __edit(self, sql, params=None):
            # Try up to max_attempts times; return 0 if every attempt fails
            max_attempts = 3
            for attempt in range(1, max_attempts + 1):
                try:
                    self.__open()
                    count = self.cursor.execute(sql, params)
                    self.conn.commit()
                    self.close()
                    return count
                except Exception as e:
                    logging.error(e)
                    self.conn.rollback()
            return 0

        def execute(self, sql, params=None):
            # Try up to max_attempts times; return None if every attempt fails
            max_attempts = 3
            for attempt in range(1, max_attempts + 1):
                try:
                    self.__open()
                    result = self.cursor.execute(sql, params)
                    self.conn.commit()
                    self.close()
                    return result
                except Exception as e:
                    logging.warning(f"retry: {attempt}")
                    logging.exception(e)

        def truncate(self, table):
            sql = f"truncate table {table}"
            self.execute(sql)

        def executemany(self, sql, data):
            # Try up to max_attempts times; return None if every attempt fails
            max_attempts = 3
            for attempt in range(1, max_attempts + 1):
                try:
                    self.__open()
                    result = self.cursor.executemany(sql, data)
                    self.conn.commit()
                    self.close()
                    return result
                except Exception as e:
                    logging.warning(f"retry: {attempt}")
                    logging.exception(e)


    # Note: besides the keys shown in the Config docstring, dev.ini must define DATABASE under [MYSQL]
    config_filepath = os.path.dirname(os.path.dirname(__file__)) + "/config/dev.ini"
    config = Config(config_filepath).get_content("MYSQL")
    conn = DBUtil(
        host=config["HOST"],
        port=int(config["PORT"]),
        database=config["DATABASE"],
        user=config["USER"],
        password=config["PASSWORD"],
        charset=config["CHARSET"],
    )


    def get_connection():
        return conn


    if __name__ == "__main__":
        db = get_connection()
        # DBUtil already holds a cursor object in its cursor attribute
        cursor = db.cursor

        # Run a SQL query with execute()
        cursor.execute("SELECT VERSION()")

        # Fetch a single row with fetchone()
        data = cursor.fetchone()
        print("Database version : %s " % data)

        # Close the database connection
        db.close()
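
The Config class above relies on two small behaviours: overriding optionxform() keeps the keys in their original case, and all-digit values are converted to int. The sketch below shows both with the standard library only. The inline `SAMPLE` text stands in for a real config file and is an assumption for illustration.

```python
import configparser

# Inline stand-in for a config file such as dev.ini (hypothetical content).
SAMPLE = """
[MYSQL]
HOST = localhost
PORT = 3306
"""

# Default behaviour: option names are lowercased.
default_parser = configparser.ConfigParser()
default_parser.read_string(SAMPLE)

# Same effect as overriding optionxform() to return the key unchanged.
case_parser = configparser.ConfigParser()
case_parser.optionxform = str
case_parser.read_string(SAMPLE)

lower_keys = default_parser.options("MYSQL")
kept_keys = case_parser.options("MYSQL")

# Same conversion as get_content(): all-digit strings become int.
content = {k: int(v) if v.isdigit() else v
           for k, v in case_parser.items("MYSQL")}

print(lower_keys, kept_keys, content)
```

Without the override, a lookup such as config["HOST"] on the resulting dict would raise KeyError, because the key would be stored as "host".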

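Configuring PySpark on Windows

Installing and configuring Spark

  • Spark version: spark-3.0.3-bin-without-hadoop

  • Hadoop version: hadoop-3.2.2

  • Download winutils (link) and put winutils and hadoop.dll into HADOOP_HOME/bin

  • Create spark-env.cmd under SPARK_HOME/conf with the following content; this fixes the log4j-not-found error when spark-shell starts

    1. Line 1: turn off command echo
    2. Line 2: set the language of Spark's messages to English
    3. Line 3: set the classpath where Spark looks for the Hadoop jars
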
@echo off
set JAVA_TOOL_OPTIONS="-Duser.language=en"
FOR /F %%i IN ('hadoop classpath') DO @set SPARK_DIST_CLASSPATH=%%i

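Fixing VS Code not finding pyspark or other custom packages

  • If py4j cannot be found, add SPARK_HOME/python/lib/py4j-0.10.9-src.zip to PYTHONPATH

  • In settings.json under .vscode, add the following so that Python programs run from the integrated terminal can find custom modules
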
"python.autoComplete.extraPaths": ["${workspaceFolder}", "${env:SPARK_HOME}/python"],
"terminal.integrated.env.windows": {
"PYTHONPATH": "${env:SPARK_HOME}/python/lib/py4j-0.10.9-src.zip;${env:SPARK_HOME}/python;${workspaceFolder};${env:PYTHONPATH};",
"SPARK_HOME":"${env:SPARK_HOME}"
}
  • In launch.json under .vscode, add the following so that debug mode can find custom modules
"env": {"PYTHONPATH":"${workspaceFolder};${env:SPARK_HOME}/python/lib/py4j-0.10.9-src.zip;${env:SPARK_HOME}/python;${env:PYTHONPATH}",
"SPARK_HOME": "${env:SPARK_HOME}"}

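Notes

python.autoComplete.extraPaths tells the VS Code Python extension where to look for additional packages, and terminal.integrated.env.windows sets the PYTHONPATH environment variable of the integrated terminal. PyCharm does not report missing custom packages because it automatically adds the project path to PYTHONPATH; in VS Code this has to be done by hand.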

Installing CUDA in WSL2 on Windows 11

Official guide

Install the WSL2-capable NVIDIA driver on Windows

Download link
Note: do not install a GPU driver inside WSL

Install CUDA

# Using 11.4 as an example
# Method 1
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.4.0/local_installers/cuda-repo-wsl-ubuntu-11-4-local_11.4.0-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-11-4-local_11.4.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-wsl-ubuntu-11-4-local/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda

# Method 2
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.4.0/local_installers/cuda-repo-ubuntu2004-11-4-local_11.4.0-470.42.01-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2004-11-4-local_11.4.0-470.42.01-1_amd64.deb
sudo apt-key add /var/cuda-repo-ubuntu2004-11-4-local/7fa2af80.pub
sudo apt-get update
sudo apt-get install -y cuda-toolkit-11-4

# Method 3
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
sudo sh -c 'echo "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 /" > /etc/apt/sources.list.d/cuda.list'
sudo apt-get update
sudo apt-get install -y cuda-toolkit-11-4

Install cuDNN

Download link
Note: the version must match the CUDA version

tar -xzvf cudnn-x.x-linux-x64-v8.x.x.x.tgz
sudo cp cuda/include/cudnn*.h /usr/local/cuda/include
sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*

Install Docker (optional)

curl https://get.docker.com | sh
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
curl -s -L https://nvidia.github.io/libnvidia-container/experimental/$distribution/libnvidia-container-experimental.list | sudo tee /etc/apt/sources.list.d/libnvidia-container-experimental.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo service docker stop
sudo service docker start
docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark

Install PyTorch

# stable(1.10)
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
# LTS(1.8.2)
conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch-lts -c nvidia

python -c "import torch;print(torch.cuda.is_available())"

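MySQL privilege configuration

Recovering after root's privileges were deleted

Start MySQL in safe mode

# Start the MySQL service in safe mode
sudo vim /etc/mysql/mysql.conf.d/mysqld.cnf
# Add the following line under the [mysqld] section
[mysqld]
skip-grant-tables
# Restart the MySQL service
sudo service mysql restart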

Restore the privileges

# While --skip-grant-tables is active, grant fails until flush privileges has been run
mysql> grant all privileges on *.* to root@"localhost";
ERROR 1290 (HY000): The MySQL server is running with the --skip-grant-tables option so it cannot execute this statement
mysql> flush privileges;
# Grant again
mysql> grant all privileges on *.* to root@"localhost";
mysql> flush privileges;

Restore the configuration file

sudo vim /etc/mysql/mysql.conf.d/mysqld.cnf

[mysqld]
# skip-grant-tables

# Restart the MySQL service
sudo service mysql restart

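Passwordless login problem

use mysql;

# MySQL < 5.7 (the user table has a password column)
update user set password=password('PASSWORD') where user='root';

# MySQL 5.7 (the column is authentication_string; the PASSWORD() function was removed in 8.0)
update user set authentication_string=password("PASSWORD") where user='root';
# Change the authentication plugin
update user set plugin="mysql_native_password" where user='root';

# Reload the grant tables
flush privileges;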

Common privilege-management commands

Create users

CREATE USER 'user1'@'host' IDENTIFIED BY '123456';
CREATE USER 'user2'@'%' IDENTIFIED BY '';
GRANT SELECT, INSERT ON test.user TO 'user2'@'%';

View the authentication plugin

SELECT user,host,plugin from mysql.user where user='root';

Change the authentication plugin

alter user 'user'@'host' identified with mysql_native_password by 'pa';
flush privileges;

Change a password

use mysql;
# mysql version <= 5.7.5
SET PASSWORD FOR 'user'@'host' = PASSWORD('newpassword');
# mysql version > 5.7.5 (PASSWORD() is deprecated there and removed in 8.0)
ALTER USER 'user'@'host' IDENTIFIED BY 'newpassword';

Drop a user

DROP USER 'user'@'host';

Granting privileges with grant

# The keyword privileges is optional
# Note: grant ... identified by works up to MySQL 5.7; in 8.0 create the user first, then grant
mysql> grant all on *.* to 'user'@'ip' identified by "password";
# 192.168.1.% stands for a whole subnet
mysql> grant all privileges on *.* to user@'192.168.1.%' identified by "123456";
mysql> grant insert,select,update,delete,drop,create,alter on database.table to user@'%' identified by "123456";
# Reload the grant tables
mysql> flush privileges;

Revoking privileges with revoke

mysql> revoke all on *.* from user@'ip';
mysql> revoke all privileges on *.* from user@'ip';
mysql> revoke insert,select,update,delete,drop,create,alter on database.table from user@'%';
mysql> flush privileges;

View privileges

# Show privileges
show grants;
show grants for user@'%';
# List users and hosts
SELECT User, Host, plugin FROM mysql.user;

Cloud server environment setup

docker

Automatic installation

# One-line install scripts, pick one of the two
curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun
curl -sSL https://get.daocloud.io/docker | sh

Manual installation

Remove old versions

sudo apt-get remove docker docker-engine docker.io containerd runc

Install dependencies

sudo apt-get install \
apt-transport-https \
ca-certificates \
curl \
gnupg-agent \
software-properties-common

Add the GPG key

curl -fsSL https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
# Verify the key
sudo apt-key fingerprint 0EBFCD88

Add the Docker APT repository

sudo add-apt-repository \
"deb [arch=amd64] https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu/ \
$(lsb_release -cs) \
stable"

Install Docker Engine - Community

sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io
# To install a specific version instead:
# list the available versions
apt-cache madison docker-ce
sudo apt-get install docker-ce=<VERSION_STRING> docker-ce-cli=<VERSION_STRING> containerd.io

mysql

Add the MySQL APT repository

# [MySQL APT Repository](https://dev.mysql.com/downloads/repo/apt/)
sudo dpkg -i mysql-apt-config_0.8.20-1_all.deb

Install MySQL

sudo apt update
sudo apt install mysql-server

Configure MySQL

sudo mysql_secure_installation

Configure remote access

sudo mysql -uroot -p
> use mysql
> update user set host='%' where user ='root';
> FLUSH PRIVILEGES;
> GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY "PASSWORD";

nginx

Install dependencies

sudo apt-get update
# Build tools: gcc and g++
sudo apt-get install build-essential libtool
# PCRE library (http://www.pcre.org/)
sudo apt-get install libpcre3 libpcre3-dev
# zlib library (http://www.zlib.net)
sudo apt-get install zlib1g-dev
# SSL library
sudo apt-get install openssl

Compile and install nginx

# https://nginx.org/en/download.html
wget http://nginx.org/download/nginx-1.20.1.tar.gz
tar zxvf nginx-1.20.1.tar.gz
cd nginx-1.20.1
sudo ./configure --prefix=/usr --sbin-path=/usr/sbin/nginx --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx/nginx.pid --lock-path=/var/lock/nginx.lock
sudo make
sudo make install

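Common commands

# Start
sudo nginx
# List the nginx processes
ps -ef|grep nginx
# Frequently used signals: stop (fast shutdown), quit (graceful shutdown), reload (reload the configuration)
sudo nginx -s stop
sudo nginx -s quit
sudo nginx -s reload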

apache2

Install apache2

sudo apt install apache2

Enable SSL

sudo mkdir /etc/apache2/ssl
# Copy the certificate files
sudo cp YourDomainName_public.crt /etc/apache2/ssl
sudo cp YourDomainName_chain.crt /etc/apache2/ssl
sudo cp YourDomainName.key /etc/apache2/ssl
# Enable the SSL module
sudo a2enmod ssl
# Edit the configuration file
sudo vim /etc/apache2/sites-available/default-ssl.conf
#######
<IfModule mod_ssl.c>
<VirtualHost *:443>
# Set ServerName to the domain bound to the certificate
ServerName www.YourDomainName.com
# Path and file name of the certificate file
SSLCertificateFile /etc/apache2/ssl/YourDomainName_public.crt
# Path and file name of the certificate key file
SSLCertificateKeyFile /etc/apache2/ssl/YourDomainName.key
# Path and file name of the certificate chain file
SSLCertificateChainFile /etc/apache2/ssl/YourDomainName_chain.crt
</VirtualHost>
</IfModule>
#######

# Reload the configuration
sudo /etc/init.d/apache2 force-reload
# Restart the service
sudo /etc/init.d/apache2 restart

Configure a reverse proxy

sudo a2enmod rewrite
sudo a2enmod proxy
sudo a2enmod proxy_http

Design patterns: singleton

The singleton pattern creates only one instance of a class

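package designpatterns;

/**
 * Double-checked locking singleton: lazy initialization, thread-safe, high performance
 */
public class Singleton {
    /**
     * The volatile keyword guarantees visibility of the instance across threads
     * and prevents instruction reordering. Creating an instance can be broken
     * into 3 steps:
     * 1. memory = allocate();   // allocate memory
     * 2. initialize(memory);    // initialize the memory
     * 3. instance = memory;     // assign the memory address to instance
     * Without volatile, steps 2 and 3 may be reordered, and another thread
     * could then see a non-null instance that is not yet initialized.
     */
    private static volatile Singleton instance;

    /**
     * Private constructor, so that it cannot be called from outside
     */
    private Singleton() {

    }

    public static Singleton getInstance() {
        if (instance == null) {                  // first check, without locking
            synchronized (Singleton.class) {
                if (instance == null) {          // second check, under the lock
                    instance = new Singleton();
                }
            }
        }
        return instance;
    }

    public static void main(String[] args) {
        MyThread[] threads = new MyThread[5];
        for (int i = 0; i < 5; i++) {
            threads[i] = new MyThread(String.format("Thread-%d", i + 1));
            threads[i].start();
        }
    }
}

class MyThread extends Thread {
    String name;

    public MyThread(String name) {
        this.name = name;
    }

    @Override
    public void run() {
        // Every thread should print the same hash code
        Singleton instance = Singleton.getInstance();
        System.out.println(this.name + ":" + instance.hashCode());
    }
}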
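
For comparison, the same double-checked locking idea can be sketched in Python. This is an illustrative sketch, not a translation the original post provides: the class is implemented through `__new__` with a `threading.Lock`, and five threads record the identity of the object they receive.

```python
import threading


class Singleton:
    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        if cls._instance is None:              # first check, without locking
            with cls._lock:
                if cls._instance is None:      # second check, under the lock
                    cls._instance = super().__new__(cls)
        return cls._instance


ids = []


def worker():
    # Each thread asks for the singleton and records the object's identity.
    ids.append(id(Singleton()))


threads = [threading.Thread(target=worker) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(set(ids)))  # number of distinct objects seen by the threads
```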

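Common issues and notes

  • By default WSL shares the Windows PATH environment variable. If a Hadoop environment has already been configured on Windows, this can cause errors inside WSL, so WSL should be stopped from inheriting the Windows PATH
sudo vim /etc/wsl.conf
# Add the following to wsl.conf

[interop]
appendWindowsPath = false

# Restart WSL by running the following command in PowerShell
wslconfig /t Ubuntu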

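Configuring the Java environment

Version compatibility

  1. Hadoop 3.3 and later support Java 8 and Java 11
  2. Hadoop 3.0.x - 3.2.x support Java 8
  3. Hadoop 2.7.x - 2.10.x support Java 7 and Java 8

Download OpenJDK 8

Version: adopt-openjdk-8u302
Download link

Extract to the target directory

# Create the target directory
mkdir -p /home/jzy/opt/jdk8
# -x extract; -v verbose; -z gzip-compressed archive; -f archive file name (must come directly before the file)
# --strip-components N removes N leading path components; 1 drops the top-level directory so its contents land directly in the target
tar -xvzf OpenJDK8U-jdk_x64_linux_hotspot_8u302b08.tar.gz --strip-components 1 -C /home/jzy/opt/jdk8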

Configure the Java environment variables

vim ~/.bashrc

# Append the following to the end of the file
export JAVA_HOME=/home/jzy/opt/jdk8
export PATH=$PATH:$JAVA_HOME/bin

# Apply the settings
source ~/.bashrc

# Check the JDK version
java -version

Installing Hadoop

Install dependencies

sudo apt-get install ssh
sudo apt-get install pdsh
sudo apt-get install openssh-client openssh-server

Extract to the target directory

# -x extract; -v verbose; -z gzip-compressed archive; -f archive file name
tar -xvzf hadoop-2.10.1.tar.gz -C /home/jzy/opt

Configure the Hadoop environment variables

  • Configure hadoop-env.sh
    vim /home/jzy/opt/hadoop-2.10.1/etc/hadoop/hadoop-env.sh
    # Add the following on an empty line
    export JAVA_HOME=/home/jzy/opt/jdk8
  • Configure .bashrc
vim ~/.bashrc
# Append the following to the end of the file
export HADOOP_HOME=/home/jzy/opt/hadoop-2.10.1
export PATH=$PATH:$HADOOP_HOME/bin

# Apply the settings
source ~/.bashrc
# Check the version
hadoop version

Single-node pseudo-distributed Hadoop deployment

Configure HDFS

  • Configure core-site.xml

    vim /home/jzy/opt/hadoop-2.10.1/etc/hadoop/core-site.xml
    <configuration>
    <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
    </property>
    <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/jzy/hadoop/tmp</value>
    </property>
    </configuration>
  • Configure hdfs-site.xml

    vim /home/jzy/opt/hadoop-2.10.1/etc/hadoop/hdfs-site.xml
    <configuration>
    <property>
    <name>dfs.replication</name>
    <value>1</value>
    </property>
    <property>
    <name>dfs.permissions</name>
    <value>false</value>
    </property>
    <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/jzy/hadoop/name</value>
    </property>
    <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/jzy/hadoop/data</value>
    </property>
    </configuration>

Configure YARN

  • mapred-site.xml
    cd /home/jzy/opt/hadoop-2.10.1/etc/hadoop/
    cp mapred-site.xml.template mapred-site.xml
    vim mapred-site.xml
    <configuration>
    <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    </property>
    </configuration>
  • yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>

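Configure passwordless login

Hadoop offers no way to type a password when it accesses the nodes of a cluster, so passwordless SSH login has to be configured

ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh localhost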

Common SSH problems

  • ssh localhost cannot connect
    • Try restarting the ssh service
      sudo service ssh restart
    • Reinstall the ssh packages
      sudo apt install --reinstall openssh-client
      sudo apt install --reinstall openssh-server

Starting Hadoop

  1. Format the NameNode

    hdfs namenode -format
  2. Start HDFS: sbin/start-dfs.sh

    cd /home/jzy/opt/hadoop-2.10.1
    sbin/start-dfs.sh
  3. View HDFS in a browser: http://localhost:50070

  4. Start YARN

    sbin/start-yarn.sh
  5. View YARN in a browser: http://localhost:8088

  6. Use the jps command to check that everything started

    # Normal output
    872 Jps
    746 SecondaryNameNode
    522 DataNode
    333 NameNode
    # With YARN started
    947 ResourceManager
    1097 NodeManager
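
The jps check in step 6 can be automated. The following is a minimal sketch; the helper `missing_daemons` and the two expected-daemon sets are assumptions for illustration, built from the process names shown in the jps output above. It parses jps output and lists which expected daemons are not running.

```python
# Daemon names taken from the jps output shown above.
EXPECTED_HDFS = {"NameNode", "DataNode", "SecondaryNameNode"}
EXPECTED_YARN = {"ResourceManager", "NodeManager"}


def missing_daemons(jps_output, expected):
    """Return the sorted names in `expected` that do not appear in jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # A jps line has the form "<pid> <name>"
        if len(parts) == 2 and parts[0].isdigit():
            running.add(parts[1])
    return sorted(expected - running)


sample = """872 Jps
746 SecondaryNameNode
522 DataNode
333 NameNode
"""

hdfs_missing = missing_daemons(sample, EXPECTED_HDFS)
all_missing = missing_daemons(sample, EXPECTED_HDFS | EXPECTED_YARN)
print(hdfs_missing, all_missing)
```

On a real machine the text would come from running jps, for example with `subprocess.run(["jps"], capture_output=True, text=True).stdout`.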

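Common problems

  • Troubleshooting
    1. First use jps to find out which service did not start
    2. Then locate the log file by service name; the logs are in the logs folder under the Hadoop installation directory
      cd /home/jzy/opt/hadoop-2.10.1/logs
      cat
  • The NameNode was not formatted successfully and fails to start
    • Delete the NameNode directory (dfs.namenode.name.dir in hdfs-site.xml, here /home/jzy/hadoop/name) and format again
      sudo rm -rf /home/jzy/hadoop/name
      hdfs namenode -format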

Configuring Scala

Download Scala

Scala 2.12.14 is compatible with Spark 3.1.2
Download link

Extract

tar -xvzf scala-2.12.14.tgz -C /home/jzy/opt/

Environment variables

vim ~/.bashrc
# Add the following to .bashrc
export SCALA_HOME=/home/jzy/opt/scala-2.12.14
export PATH=$PATH:$SCALA_HOME/bin

Configuring Spark

Download Spark

Choose version 3.1.2, pre-built with user-provided Hadoop
It is compatible with Hadoop 2.10.1
Download link

Extract

tar -xvzf spark-3.1.2-bin-without-hadoop.tgz -C /home/jzy/opt/

Link Spark to Hadoop

  • conf/spark-env.sh
    # Add this line
    export SPARK_DIST_CLASSPATH=$(hadoop classpath)

mysql

  1. Add the APT repository
  2. Install MySQL
  3. Install the JDBC driver
    /usr/share/java/mysql-connector-java-8.0.26.jar

HBase

  • ~/.bashrc

    export HBASE_HOME=/home/jzy/opt/hbase-2.4.5
    export PATH=$PATH:$HBASE_HOME/bin
  • conf/hbase-site.xml

    <configuration>
    <property>
    <name>hbase.cluster.distributed</name>
    <value>false</value>
    </property>
    <property>
    <name>hbase.tmp.dir</name>
    <value>/home/jzy/hbase/tmp</value>
    </property>
    <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
    </property>
    <property>
    <name>hbase.unsafe.stream.capability.enforce</name>
    <value>false</value>
    </property>
    </configuration>

Hive

  • ~/.bashrc

    export HIVE_HOME=/home/jzy/opt/hive-2.4.9
    export PATH=$PATH:$HIVE_HOME/bin
  • conf/hive-site.xml

    <configuration>
    <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    </property>
    <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.cj.jdbc.Driver</value>
    </property>
    <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    </property>
    <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>0731</value>
    </property>
    <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
    </property>
    </configuration>

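Salting passwords at registration

Salting

To keep plaintext passwords from being stolen, the server stores only the salted hash of each password.
So that the stored value cannot be reversed, the one-way MD5 digest algorithm is used:
$md5(md5(password)+salt)$
salt is a random string generated separately for each user and must be stored in the database.
The front end first hashes the password with MD5. The back end takes the received MD5 value, fetches the user's salt from the database, concatenates the two, hashes the result with MD5 again, and compares it with the hash stored in the database.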
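
The flow described above can be sketched in Python with the standard library. This is a minimal illustration under stated assumptions, not the post's own implementation: the function names (`md5_hex`, `new_salt`, `stored_hash`, `verify`), the 16-character salt, and the sample password are all hypothetical.

```python
import hashlib
import hmac
import secrets


def md5_hex(text):
    """MD5 digest of a string, as a lowercase hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def new_salt(n_bytes=8):
    """Random per-user salt: 2 * n_bytes hex characters."""
    return secrets.token_hex(n_bytes)


def stored_hash(client_md5, salt):
    """Server side: md5(md5(password) + salt)."""
    return md5_hex(client_md5 + salt)


def verify(client_md5, salt, expected):
    """Recompute the salted hash and compare it in constant time."""
    return hmac.compare_digest(stored_hash(client_md5, salt), expected)


# Registration: the front end sends md5(password); the server stores salt and hash.
salt = new_salt()
record = stored_hash(md5_hex("s3cret"), salt)

# Login: the same computation is repeated with the salt read from the database.
ok = verify(md5_hex("s3cret"), salt, record)
bad = verify(md5_hex("wrong"), salt, record)
print(ok, bad)
```

Because the salt differs per user, two users with the same password end up with different stored hashes.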

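Asymmetric encryption can be used over plain HTTP

With RSA, the server generates a public key and a private key and sends the public key to the front end; the front end encrypts the password with the public key, and the server decrypts it with the private key

Using HTTPS

Zhihu: 彻底搞懂HTTPS的加密原理 (Thoroughly understanding how HTTPS encryption works)

Implementing MD5 in a PasswordUtil class
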
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

public class PasswordUtil {

    private static final char[] HEX = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'};
    private static final int SALT_LENGTH = 16;
    private static final SecureRandom RANDOM = new SecureRandom();

    /**
     * Simple custom salt generator: a random string of length 16 in which
     * every character is a random hexadecimal digit
     */
    public static String getSalt() {
        StringBuilder sb = new StringBuilder(SALT_LENGTH);
        for (int i = 0; i < SALT_LENGTH; i++) {
            // Pick from all 16 hex digits
            sb.append(HEX[RANDOM.nextInt(HEX.length)]);
        }
        return sb.toString();
    }

    private static String byte2HexStr(byte[] bytes) {
        StringBuilder result = new StringBuilder(bytes.length * 2);
        // Each byte becomes two hex characters: high 4 bits, then low 4 bits
        for (byte byte0 : bytes) {
            result.append(HEX[(byte0 >>> 4) & 0xf]);
            result.append(HEX[byte0 & 0xf]);
        }
        return result.toString();
    }

    public static String MD5WithSalt(String inputStr, String salt) {
        try {
            // Select the MD5 algorithm; changing the argument to "SHA" selects SHA instead
            MessageDigest md = MessageDigest.getInstance("MD5");
            // Append the salt to the input (do not log the plaintext or the salt)
            String inputWithSalt = inputStr + salt;
            // Compute the hash and convert it to a hex string
            return byte2HexStr(md.digest(inputWithSalt.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            // Fail loudly instead of returning the error text as if it were a hash
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

}