{ "cells": [ { "cell_type": "code", "execution_count": 1, "id": "initial_id", "metadata": { "ExecuteTime": { "end_time": "2025-02-08T12:03:10.007903Z", "start_time": "2025-02-08T12:03:08.375866Z" } }, "outputs": [], "source": [ "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "from catboost import Pool, CatBoostRegressor" ] }, { "cell_type": "markdown", "id": "698ef0c6-d381-40b3-93f8-74b6904ba9a4", "metadata": {}, "source": [ "### Can the task be solved in one quick pass?\n", "\n", "First, let's load the data and look at what we have to work with.\n", "\n", "In this competition, participants are given 4 feature sets, each as a pair of files (train and test):\n", "1. train_main.parquet (279 features + key)\n", "2. train_card_spending.parquet (630 features + key)\n", "3. train_mcc_operations.parquet (1640 features + key)\n", "4. train_mcc_preferences.parquet (2112 features + key)\n", "\n", "For simplicity, let's try using only the first set (*_main.parquet*).\n", "\n", "#### Feature data (set 1 of 4):\n" ] }, { "cell_type": "code", "execution_count": 2, "id": "44cafd85-2c29-4d73-b7ac-09b6fa32907f", "metadata": {}, "outputs": [], "source": [ "train = pd.read_parquet('data/task3/train_main.parquet')\n", "test = pd.read_parquet('data/task3/test_main.parquet')" ] }, { "cell_type": "code", "execution_count": 3, "id": "e12cd977-b91b-42e1-a448-cf04d1d209e6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Training data: (213345, 280)\n", "Test data: (318451, 280)\n" ] } ], "source": [ "print('Training data:', train.shape)\n", "print('Test data:', test.shape)" ] }, { "cell_type": "code", "execution_count": 4, "id": "7315a29f-f3ef-424b-8543-ee571f124d64", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " user_id app_children_cnt app_dependent_cnt app_family_cnt \\\n", "0 9 NaN NaN NaN \n", "1 11 NaN NaN NaN \n", "2 12 0.0 0.0 0.0 \n", "3 13 1.0 0.0 3.0 \n", "4 15 NaN NaN NaN \n", "\n", " app_income_app app_real_estate_ind app_vehicle_ind \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 105372.960938 0.0 0.0 \n", "3 0.000000 0.0 0.0 \n", "4 NaN 0.0 0.0 \n", "\n", " avg_dep_avg_balance_12month_amt avg_dep_avg_balance_12month_amt_term \\\n", "0 NaN NaN \n", "1 NaN NaN \n", "2 315208.781250 NaN \n", "3 43187.953125 NaN \n", "4 NaN NaN \n", "\n", " avg_dep_avg_balance_12month_amt_term_savings ... \\\n", "0 4027.373535 ... \n", "1 NaN ... \n", "2 274816.375000 ... \n", "3 5277.233887 ... \n", "4 0.000000 ... \n", "\n", " savings_sum_oms_debet_3m savings_sum_oms_debet_6m \\\n", "0 34.613216 0.000000 \n", "1 6.237672 0.000000 \n", "2 0.000000 53.134129 \n", "3 0.000000 22.276114 \n", "4 0.000000 0.000000 \n", "\n", " savings_sum_oms_debet_9m savings_sum_oms_debet_12m \\\n", "0 4.310414 75.214180 \n", "1 0.000000 0.000000 \n", "2 90.025238 0.000000 \n", "3 82.070015 117.386795 \n", "4 0.000000 33.072178 \n", "\n", " savings_service_model_cd savings_pension_flg savings_deposit_flg \\\n", "0 Массовый 0 0 \n", "1 Массовый 0 0 \n", "2 Массовый 0 0 \n", "3 Массовый 0 0 \n", "4 Массовый 0 0 \n", "\n", " savings_safe_acc_flg savings_broker_flg savings_oms_flg \n", "0 1 0 0 \n", "1 1 0 0 \n", "2 1 0 0 \n", "3 1 0 0 \n", "4 1 0 0 \n", "\n", "[5 rows x 280 columns]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "train.head(n = 5)" ] }, { "cell_type": "markdown", "id": "607e32ae-f33c-45f4-ae18-93338a822986", "metadata": {}, "source": [ "#### Особенности данных\n", "\n", "Сразу видно очень много пропусков. \n", "\n", "Наверное с пропусками даже придётся что-то делать, ведь где-то их 50% или больше." 
] }, { "cell_type": "code", "execution_count": 5, "id": "f0e01e42-4e99-4ab1-86e4-dabe2cd3730b", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Columns with missing values:\n", " vehicle_counrty_type_nm 206739\n", "max_amt_foreign_cur_5y 201136\n", "max_amt_dep_6m 182526\n", "min_amt_term_g1y 180330\n", "max_amt_dep_act 175074\n", " ... \n", "cnt_foreign_cur_5y 26\n", "cnt_save_5y 26\n", "cnt_grow_5y 26\n", "cnt_term_g1y 26\n", "cnt_manage_5y 26\n", "Length: 167, dtype: int64\n" ] } ], "source": [ "missing_values = train.isnull().sum().sort_values(ascending = False)\n", "missing_values = missing_values[missing_values > 0]\n", "print('\\nColumns with missing values:\\n', missing_values)" ] }, { "cell_type": "markdown", "id": "19e0c8d0-60a1-47fe-93a6-a9d810471ead", "metadata": {}, "source": [ "I wonder: is it exactly the same in the test data? \n", "\n", "That probably needs to be checked 🤔\n", "\n", "Some algorithms will be in trouble if the test set has missing values in features that had none in the training data 👀" ] }, { "cell_type": "markdown", "id": "31e49b25-225b-4d5f-997d-923e81a7f3a3", "metadata": {}, "source": [ "#### Data types\n", "\n", "It is worth taking a high-level look at what is inside. The categorical features are of particular interest." ] }, { "cell_type": "code", "execution_count": 6, "id": "2975998a-e869-445d-ab7d-140eb7dd6c4d", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "float64 266\n", "object 8\n", "int32 6\n", "Name: count, dtype: int64" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "train.dtypes.value_counts()" ] }, { "cell_type": "markdown", "id": "1605bdb7-a901-406a-b936-39a901be144b", "metadata": {}, "source": [ "The categorical features should be singled out and saved separately for CatBoost." 
] }, { "cell_type": "code", "execution_count": 7, "id": "74f07ad6-0ef9-4368-9ad2-266ad9a0ac42", "metadata": {}, "outputs": [], "source": [ "features = train.columns\n", "\n", "# Columns with dtype 'object' are treated as categorical.\n", "categorical_features = train[features].select_dtypes(include=['object']).columns\n", "\n", "# CatBoost rejects NaN in categorical columns, so cast the values to strings.\n", "for feature in categorical_features:\n", "    train[feature] = train[feature].astype(str)\n", "\n", "# Positional indices of the categorical columns.\n", "categorical_features_indices = np.where(train.dtypes == 'object')[0]" ] }, { "cell_type": "markdown", "id": "d9a7e087-20a0-45e3-a94a-a1e2bc7be6d1", "metadata": {}, "source": [ "For a first attempt, this is enough. The only thing missing is the target variable.\n", "\n", "### Target variable\n", "\n" ] }, { "cell_type": "code", "execution_count": 8, "id": "25261de2-03f1-41d6-8830-a50bdc0dd061", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(213345, 2)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "target = pd.read_csv('data/task3/train_target.csv')\n", "target.shape" ] }, { "cell_type": "code", "execution_count": 9, "id": "beea57de-e043-44ce-b01e-ebd201017b84", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " user_id target\n", "0 9 0.00000\n", "1 11 0.00000\n", "2 12 219932.90625\n", "3 13 631.77002\n", "4 15 0.00000" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "target.head(5)" ] }, { "cell_type": "markdown", "id": "9e823601-0db6-485a-9f39-bb31f9ab6bb7", "metadata": {}, "source": [ "Мы решаем задачу регрессии. Из описания соревнования, нам требуется предсказать:\n", "\n", "> 50 перцентиль распределения суммарных остатков на всех накоп.ительных счетах клиента на горизонте +2 мес. от отчетной даты\n", "\n" ] }, { "cell_type": "code", "execution_count": 10, "id": "bf91b2ba-0307-4b19-8420-f2beae248e53", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "count 2.133450e+05\n", "mean 2.210490e+05\n", "std 9.894988e+05\n", "min -7.100000e-01\n", "25% 0.000000e+00\n", "50% 3.174000e+01\n", "75% 1.000027e+05\n", "max 1.015605e+08\n", "Name: target, dtype: float64" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "target['target'].describe()" ] }, { "cell_type": "markdown", "id": "99e823e8-0f67-40df-b932-c5c4b603f6e0", "metadata": {}, "source": [ "Целевая переменная точно требует ее преобразовать. 
Посмотрим как она выглядит после log1p:" ] }, { "cell_type": "code", "execution_count": 11, "id": "f3ae0ce4-1295-4903-af59-b86ba61ff3ec", "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAA1EAAAFfCAYAAACvEEbzAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjAsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvlHJYcgAAAAlwSFlzAAAPYQAAD2EBqD+naQAAOV1JREFUeJzt3X9UVPeB///XlB8TZOEWRWacDSY0yxIJNE0wi2ga3aqgFWlOe6op3aluLJolkU6F9Ue7/dTtacFf1WzLJjVZG1Njl55dYzZnVQLZJrRUUUOlDUZNdmMiroyYOg5o6AzB+/0j39ztAKIXRaM+H+fMOcy9r3vnfW9uwBfv4Y7DNE1TAAAAAIBL8olrPQAAAAAAuJ5QogAAAADABkoUAAAAANhAiQIAAAAAGyhRAAAAAGADJQoAAAAAbKBEAQAAAIAN0dd6ANfS+fPndeLECSUkJMjhcFzr4QAAAAC4RkzTVFdXlzwejz7xicHnmm7qEnXixAmlpqZe62EAAAAA+Jhoa2vTrbfeOmjmpi5RCQkJkj48UYmJidd4NAAAAACulc7OTqWmplodYTA3dYn66C18iYmJlCgAAAAAl/RnPtxYAgAAAABsoEQBAAAAgA2UKAAAAACwgRIFAAAAADZQogAAAADABkoUAAAAANhgq0R98MEH+od/+AelpaUpLi5On/rUp/S9731P58+ftzKmaWrlypXyeDyKi4vTlClTdPDgwYj9hEIhLV68WMnJyYqPj1dRUZGOHz8ekQkEAvJ6vTIMQ4ZhyOv16syZMxGZY8eOafbs2YqPj1dycrLKysoUDodtngIAAAAAuHS2StTq1av1k5/8RNXV1Tp06JDWrFmjtWvX6sc//rGVWbNmjdavX6/q6mrt379fbrdb06dPV1dXl5Xx+Xzavn27ampq1NjYqLNnz6qwsFC9vb1Wpri4WC0tLaqtrVVtba1aWlrk9Xqt9b29vZo1a5bOnTunxsZG1dTUaNu2bSovL7+c8wEAAAAAg3KYpmleariwsFAul0ubNm2yln3pS1/SiBEjtGXLFpmmKY/HI5/Pp2XLlkn6cNbJ5XJp9erVWrRokYLBoEaPHq0tW7Zo7ty5kqQTJ04oNTVVO3fuVEFBgQ4dOqTMzEw1NTUpNzdXktTU1KS8vDwdPnxYGRkZ2rVrlwoLC9XW1iaPxyNJqqmp0fz589XR0THgh+eGQiGFQiHr+UefShwMBvmwXQAAAOAm1tnZKcMwLqkb2JqJuv/++/Vf//VfevPNNyVJv/vd79TY2KjPf/7zkqSjR4/K7/crPz/f2sbpdGry5MnavXu3JKm5uVk9PT0RGY/Ho6ysLCuzZ88eGYZhFShJmjBhggzDiMhkZWVZBUqSCgoKFAqF1NzcPOD4q6qqrLcHGoah1NRUO4cPAAAAAIq2E162bJmCwaDuvPNORUVFqbe3Vz/4wQ/0la98RZLk9/slSS6XK2I7l8uld99918rExsYqKSmpX+aj7f1+v1JSUvq9fkpKSkSm7+skJSUpNjbWyvS1YsUKLVmyxHr+0UwUAAAAAFwqWyXqF7/4hZ577jn9/Oc/11133aWWlhb5fD55PB7NmzfPyjkcjojtTNPst6yvvpmB8kPJ/Cmn0ymn0znoOAAAAABgMLZK1N///d9r+fLleuihhyRJ2dnZevfdd1VVVaV58+bJ7XZL+nCWaMyYMdZ2HR0d1qyR2+1WOBxWIBCImI3q6OjQxIkTrczJkyf7vf6pU6ci9rN3796I9YFAQD09Pf1mqHDl3b58h/X1O6tmXcORAAAAAFeXrb+Jev/
99/WJT0RuEhUVZd3iPC0tTW63W/X19db6cDishoYGqyDl5OQoJiYmItPe3q7W1lYrk5eXp2AwqH379lmZvXv3KhgMRmRaW1vV3t5uZerq6uR0OpWTk2PnsAAAAADgktmaiZo9e7Z+8IMfaOzYsbrrrrt04MABrV+/Xg8//LCkD99e5/P5VFlZqfT0dKWnp6uyslIjRoxQcXGxJMkwDC1YsEDl5eUaNWqURo4cqYqKCmVnZ2vatGmSpHHjxmnGjBkqKSnRxo0bJUkLFy5UYWGhMjIyJEn5+fnKzMyU1+vV2rVrdfr0aVVUVKikpIQ77QEAAAAYNrZK1I9//GN95zvfUWlpqTo6OuTxeLRo0SL9v//3/6zM0qVL1d3drdLSUgUCAeXm5qqurk4JCQlWZsOGDYqOjtacOXPU3d2tqVOnavPmzYqKirIyW7duVVlZmXUXv6KiIlVXV1vro6KitGPHDpWWlmrSpEmKi4tTcXGx1q1bN+STAQAAAAAXY+tzom40du4Fj0j8TRQAAABuJMP2OVEAAAAAcLOjRAEAAACADZQoAAAAALCBEgUAAAAANlCiAAAAAMAGShQAAAAA2ECJAgAAAAAbKFEAAAAAYAMlCgAAAABsoEQBAAAAgA2UKAAAAACwgRIFAAAAADZQogAAAADABkoUAAAAANhAiQIAAAAAGyhRAAAAAGADJQoAAAAAbKBEAQAAAIANlCgAAAAAsIESBQAAAAA2UKIAAAAAwAZKFAAAAADYQIkCAAAAABsoUQAAAABgAyUKAAAAAGygRAEAAACADbZK1O233y6Hw9Hv8eijj0qSTNPUypUr5fF4FBcXpylTpujgwYMR+wiFQlq8eLGSk5MVHx+voqIiHT9+PCITCATk9XplGIYMw5DX69WZM2ciMseOHdPs2bMVHx+v5ORklZWVKRwOD+EUAAAAAMCls1Wi9u/fr/b2dutRX18vSfryl78sSVqzZo3Wr1+v6upq7d+/X263W9OnT1dXV5e1D5/Pp+3bt6umpkaNjY06e/asCgsL1dvba2WKi4vV0tKi2tpa1dbWqqWlRV6v11rf29urWbNm6dy5c2psbFRNTY22bdum8vLyyzoZAAAAAHAxDtM0zaFu7PP59J//+Z966623JEkej0c+n0/Lli2T9OGsk8vl0urVq7Vo0SIFg0GNHj1aW7Zs0dy5cyVJJ06cUGpqqnbu3KmCggIdOnRImZmZampqUm5uriSpqalJeXl5Onz4sDIyMrRr1y4VFhaqra1NHo9HklRTU6P58+ero6NDiYmJA443FAopFApZzzs7O5WamqpgMHjBbTCw25fvsL5+Z9WsazgSAAAA4PJ1dnbKMIxL6gZD/puocDis5557Tg8//LAcDoeOHj0qv9+v/Px8K+N0OjV58mTt3r1bktTc3Kyenp6IjMfjUVZWlpXZs2ePDMOwCpQkTZgwQYZhRGSysrKsAiVJBQUFCoVCam5uvuCYq6qqrLcIGoah1NTUoR4+AAAAgJvUkEvUCy+8oDNnzmj+/PmSJL/fL0lyuVwROZfLZa3z+/2KjY1VUlLSoJmUlJR+r5eSkhKR6fs6SUlJio2NtTIDWbFihYLBoPVoa2uzccQAAAAAIEUPdcNNmzZp5syZEbNBkuRwOCKem6bZb1lffTMD5YeS6cvpdMrpdA46FgAAAAAYzJBmot599129/PLL+vrXv24tc7vdktRvJqijo8OaNXK73QqHwwoEAoNmTp482e81T506FZHp+zqBQEA9PT39ZqgAAAAA4EoaUol65plnlJKSolmz/u+GAmlpaXK73dYd+6QP/26qoaFBEydOlCTl5OQoJiYmItPe3q7W1lYrk5eXp2AwqH379lmZvXv3KhgMRmRaW1vV3t5uZerq6uR0OpWTkzOUQwIAAACAS2L77Xznz5/XM888o3nz5ik6+v82dzgc8vl8qqysVHp6utLT01VZWakRI0aouLhYkmQYhhYsWKDy8nKNGjVKI0e
OVEVFhbKzszVt2jRJ0rhx4zRjxgyVlJRo48aNkqSFCxeqsLBQGRkZkqT8/HxlZmbK6/Vq7dq1On36tCoqKlRSUsJd9gAAAAAMK9sl6uWXX9axY8f08MMP91u3dOlSdXd3q7S0VIFAQLm5uaqrq1NCQoKV2bBhg6KjozVnzhx1d3dr6tSp2rx5s6KioqzM1q1bVVZWZt3Fr6ioSNXV1db6qKgo7dixQ6WlpZo0aZLi4uJUXFysdevW2T0cAAAAALDlsj4n6npn517wiMTnRAEAAOBGclU+JwoAAAAAbkaUKAAAAACwgRIFAAAAADZQogAAAADABkoUAAAAANhAiQIAAAAAGyhRAAAAAGADJQoAAAAAbKBEAQAAAIANlCgAAAAAsIESBQAAAAA2UKIAAAAAwAZKFAAAAADYQIkCAAAAABsoUQAAAABgAyUKAAAAAGygRAEAAACADZQoAAAAALCBEgUAAAAANlCiAAAAAMAGShQAAAAA2ECJAgAAAAAbKFEAAAAAYAMlCgAAAABsoEQBAAAAgA2UKAAAAACwwXaJ+t///V/9zd/8jUaNGqURI0boM5/5jJqbm631pmlq5cqV8ng8iouL05QpU3Tw4MGIfYRCIS1evFjJycmKj49XUVGRjh8/HpEJBALyer0yDEOGYcjr9erMmTMRmWPHjmn27NmKj49XcnKyysrKFA6H7R4SAAAAAFwyWyUqEAho0qRJiomJ0a5du/TGG2/ohz/8oT75yU9amTVr1mj9+vWqrq7W/v375Xa7NX36dHV1dVkZn8+n7du3q6amRo2NjTp79qwKCwvV29trZYqLi9XS0qLa2lrV1taqpaVFXq/XWt/b26tZs2bp3LlzamxsVE1NjbZt26by8vLLOB0AAAAAMDiHaZrmpYaXL1+u3/zmN/r1r3894HrTNOXxeOTz+bRs2TJJH846uVwurV69WosWLVIwGNTo0aO1ZcsWzZ07V5J04sQJpaamaufOnSooKNChQ4eUmZmppqYm5ebmSpKampqUl5enw4cPKyMjQ7t27VJhYaHa2trk8XgkSTU1NZo/f746OjqUmJjYb3yhUEihUMh63tnZqdTUVAWDwQHzuLDbl++wvn5n1axrOBIAAADg8nV2dsowjEvqBrZmol588UWNHz9eX/7yl5WSkqJ77rlHTz/9tLX+6NGj8vv9ys/Pt5Y5nU5NnjxZu3fvliQ1Nzerp6cnIuPxeJSVlWVl9uzZI8MwrAIlSRMmTJBhGBGZrKwsq0BJUkFBgUKhUMTbC/9UVVWV9fZAwzCUmppq5/ABAAAAwF6Jevvtt/Xkk08qPT1dL730kh555BGVlZXpZz/7mSTJ7/dLklwuV8R2LpfLWuf3+xUbG6ukpKRBMykpKf1ePyUlJSLT93WSkpIUGxtrZfpasWKFgsGg9Whra7Nz+AAAAACgaDvh8+fPa/z48aqsrJQk3XPPPTp48KCefPJJfe1rX7NyDocjYjvTNPst66tvZqD8UDJ/yul0yul0DjoOAAAAABiMrZmoMWPGKDMzM2LZuHHjdOzYMUmS2+2WpH4zQR0dHdaskdvtVjgcViAQGDRz8uTJfq9/6tSpiEzf1wkEAurp6ek3QwUAAAAAV4qtEjVp0iQdOXIkYtmbb76p2267TZKUlpYmt9ut+vp6a304HFZDQ4MmTpwoScrJyVFMTExEpr29Xa2trVYmLy9PwWBQ+/btszJ79+5VMBiMyLS2tqq9vd3K1NXVyel0Kicnx85hAQAAAMAls/V2vm9+85uaOHGiKisrNWfOHO3bt09PPfWUnnrqKUkfvr3O5/OpsrJS6enpSk9PV2VlpUaMGKHi4mJJkmEYWrBggcrLyzVq1CiNHDlSFRUVys7O1rRp0yR9OLs1Y8YMlZSUaOPGjZKkhQsXqrCwUBkZGZKk/Px8ZWZmyuv1au3atTp9+rQqKipUUlLCnfYAAAAADBtbJeq+++7T9u3btWLFCn3ve99TWlqaHn/8cX3
1q1+1MkuXLlV3d7dKS0sVCASUm5ururo6JSQkWJkNGzYoOjpac+bMUXd3t6ZOnarNmzcrKirKymzdulVlZWXWXfyKiopUXV1trY+KitKOHTtUWlqqSZMmKS4uTsXFxVq3bt2QTwYAAAAAXIytz4m60di5Fzwi8TlRAAAAuJEM2+dEAQAAAMDNjhIFAAAAADZQogAAAADABkoUAAAAANhAiQIAAAAAGyhRAAAAAGADJQoAAAAAbKBEAQAAAIANlCgAAAAAsIESBQAAAAA2UKIAAAAAwAZKFAAAAADYQIkCAAAAABsoUQAAAABgAyUKAAAAAGygRAEAAACADZQoAAAAALCBEgUAAAAANlCiAAAAAMAGShQAAAAA2ECJAgAAAAAbKFEAAAAAYAMlCgAAAABsoEQBAAAAgA2UKAAAAACwgRIFAAAAADbYKlErV66Uw+GIeLjdbmu9aZpauXKlPB6P4uLiNGXKFB08eDBiH6FQSIsXL1ZycrLi4+NVVFSk48ePR2QCgYC8Xq8Mw5BhGPJ6vTpz5kxE5tixY5o9e7bi4+OVnJyssrIyhcNhm4cPAAAAAPbYnom666671N7ebj1ef/11a92aNWu0fv16VVdXa//+/XK73Zo+fbq6urqsjM/n0/bt21VTU6PGxkadPXtWhYWF6u3ttTLFxcVqaWlRbW2tamtr1dLSIq/Xa63v7e3VrFmzdO7cOTU2Nqqmpkbbtm1TeXn5UM8DAAAAAFySaNsbREdHzD59xDRNPf744/r2t7+tL37xi5KkZ599Vi6XSz//+c+1aNEiBYNBbdq0SVu2bNG0adMkSc8995xSU1P18ssvq6CgQIcOHVJtba2ampqUm5srSXr66aeVl5enI0eOKCMjQ3V1dXrjjTfU1tYmj8cjSfrhD3+o+fPn6wc/+IESExMHHHsoFFIoFLKed3Z22j18AAAAADc52zNRb731ljwej9LS0vTQQw/p7bffliQdPXpUfr9f+fn5VtbpdGry5MnavXu3JKm5uVk9PT0RGY/Ho6ysLCuzZ88eGYZhFShJmjBhggzDiMhkZWVZBUqSCgoKFAqF1NzcfMGxV1VVWW8RNAxDqampdg8fAAAAwE3OVonKzc3Vz372M7300kt6+umn5ff7NXHiRP3hD3+Q3++XJLlcrohtXC6Xtc7v9ys2NlZJSUmDZlJSUvq9dkpKSkSm7+skJSUpNjbWygxkxYoVCgaD1qOtrc3O4QMAAACAvbfzzZw50/o6OztbeXl5uuOOO/Tss89qwoQJkiSHwxGxjWma/Zb11TczUH4omb6cTqecTuegYwEAAACAwVzWLc7j4+OVnZ2tt956y/o7qb4zQR0dHdaskdvtVjgcViAQGDRz8uTJfq916tSpiEzf1wkEAurp6ek3QwUAAAAAV9JllahQKKRDhw5pzJgxSktLk9vtVn19vbU+HA6roaFBEydOlCTl5OQoJiYmItPe3q7W1lYrk5eXp2AwqH379lmZvXv3KhgMRmRaW1vV3t5uZerq6uR0OpWTk3M5hwQAAAAAg7L1dr6KigrNnj1bY8eOVUdHh77//e+rs7NT8+bNk8PhkM/nU2VlpdLT05Wenq7KykqNGDFCxcXFkiTDMLRgwQKVl5dr1KhRGjlypCoqKpSdnW3drW/cuHGaMWOGSkpKtHHjRknSwoULVVhYqIyMDElSfn6+MjMz5fV6tXbtWp0+fVoVFRUqKSm54J35AAAAAOBKsFWijh8/rq985St67733NHr0aE2YMEFNTU267bbbJElLly5Vd3e3SktLFQgElJubq7q6OiUkJFj72LBhg6KjozVnzhx1d3dr6tSp2rx5s6KioqzM1q1bVVZWZt3Fr6ioSNXV1db6qKgo7dixQ6WlpZo0aZLi4uJUXFysdevWXdbJAAAAAICLcZimaV7rQVwrnZ2dMgxDwWCQGSybbl++w/r6nVWzruFIAAAAgMtnpxtc1t9EAQAAAMDNhhIFAAAAADZQogA
AAADABkoUAAAAANhAiQIAAAAAGyhRAAAAAGADJQoAAAAAbKBEAQAAAIANlCgAAAAAsIESBQAAAAA2UKIAAAAAwAZKFAAAAADYQIkCAAAAABsoUQAAAABgAyUKAAAAAGygRAEAAACADZQoAAAAALCBEgUAAAAANlCiAAAAAMAGShQAAAAA2ECJAgAAAAAbKFEAAAAAYAMlCgAAAABsoEQBAAAAgA2UKAAAAACwgRIFAAAAADZcVomqqqqSw+GQz+ezlpmmqZUrV8rj8SguLk5TpkzRwYMHI7YLhUJavHixkpOTFR8fr6KiIh0/fjwiEwgE5PV6ZRiGDMOQ1+vVmTNnIjLHjh3T7NmzFR8fr+TkZJWVlSkcDl/OIQEAAADAoIZcovbv36+nnnpKn/70pyOWr1mzRuvXr1d1dbX2798vt9ut6dOnq6ury8r4fD5t375dNTU1amxs1NmzZ1VYWKje3l4rU1xcrJaWFtXW1qq2tlYtLS3yer3W+t7eXs2aNUvnzp1TY2OjampqtG3bNpWXlw/1kAAAAADgohymaZp2Nzp79qzuvfdePfHEE/r+97+vz3zmM3r88cdlmqY8Ho98Pp+WLVsm6cNZJ5fLpdWrV2vRokUKBoMaPXq0tmzZorlz50qSTpw4odTUVO3cuVMFBQU6dOiQMjMz1dTUpNzcXElSU1OT8vLydPjwYWVkZGjXrl0qLCxUW1ubPB6PJKmmpkbz589XR0eHEhMT+407FAopFApZzzs7O5WamqpgMDhgHhd2+/Id1tfvrJp1DUcCAAAAXL7Ozk4ZhnFJ3WBIM1GPPvqoZs2apWnTpkUsP3r0qPx+v/Lz861lTqdTkydP1u7duyVJzc3N6unpich4PB5lZWVZmT179sgwDKtASdKECRNkGEZEJisryypQklRQUKBQKKTm5uYBx11VVWW9PdAwDKWmpg7l8AEAAADcxGyXqJqaGv32t79VVVVVv3V+v1+S5HK5Ipa7XC5rnd/vV2xsrJKSkgbNpKSk9Nt/SkpKRKbv6yQlJSk2NtbK9LVixQoFg0Hr0dbWdimHDAAAAACWaDvhtrY2feMb31BdXZ1uueWWC+YcDkfEc9M0+y3rq29moPxQMn/K6XTK6XQOOg4AAAAAGIytmajm5mZ1dHQoJydH0dHRio6OVkNDg370ox8pOjramhnqOxPU0dFhrXO73QqHwwoEAoNmTp482e/1T506FZHp+zqBQEA9PT39ZqgAAAAA4EqxVaKmTp2q119/XS0tLdZj/Pjx+upXv6qWlhZ96lOfktvtVn19vbVNOBxWQ0ODJk6cKEnKyclRTExMRKa9vV2tra1WJi8vT8FgUPv27bMye/fuVTAYjMi0traqvb3dytTV1cnpdConJ2cIpwIAAAAALs7W2/kSEhKUlZUVsSw+Pl6jRo2ylvt8PlVWVio9PV3p6emqrKzUiBEjVFxcLEkyDEMLFixQeXm5Ro0apZEjR6qiokLZ2dnWjSrGjRunGTNmqKSkRBs3bpQkLVy4UIWFhcrIyJAk5efnKzMzU16vV2vXrtXp06dVUVGhkpIS7rQHAAAAYNjYKlGXYunSperu7lZpaakCgYByc3NVV1enhIQEK7NhwwZFR0drzpw56u7u1tSpU7V582ZFRUVZma1bt6qsrMy6i19RUZGqq6ut9VFRUdqxY4dKS0s1adIkxcXFqbi4WOvWrbvShwQAAAAAliF9TtSNws694BGJz4kCAADAjWTYPycKAAAAAG5WlCgAAAAAsIESBQAAAAA2UKIAAAAAwAZKFAAAAADYQIkCAAAAABsoUQAAAABgAyUKAAAAAGygRAEAAACADZQoAAAAALCBEgUAAAAANlCiAAAAAMAGShQAAAAA2ECJAgAAAAAbKFEAAAAAYAMlCgAAAABsoEQBAAAAgA2UKAAAAACwgRIFAAAAADZQogAAAADABkoUAAAAANhAiQIAAAAAGyhRAAAAAGADJQoAAAAAbKBEAQAAAIA
NlCgAAAAAsMFWiXryySf16U9/WomJiUpMTFReXp527dplrTdNUytXrpTH41FcXJymTJmigwcPRuwjFApp8eLFSk5OVnx8vIqKinT8+PGITCAQkNfrlWEYMgxDXq9XZ86cicgcO3ZMs2fPVnx8vJKTk1VWVqZwOGzz8AEAAADAHlsl6tZbb9WqVav02muv6bXXXtPnPvc5feELX7CK0po1a7R+/XpVV1dr//79crvdmj59urq6uqx9+Hw+bd++XTU1NWpsbNTZs2dVWFio3t5eK1NcXKyWlhbV1taqtrZWLS0t8nq91vre3l7NmjVL586dU2Njo2pqarRt2zaVl5df7vkAAAAAgEE5TNM0L2cHI0eO1Nq1a/Xwww/L4/HI5/Np2bJlkj6cdXK5XFq9erUWLVqkYDCo0aNHa8uWLZo7d64k6cSJE0pNTdXOnTtVUFCgQ4cOKTMzU01NTcrNzZUkNTU1KS8vT4cPH1ZGRoZ27dqlwsJCtbW1yePxSJJqamo0f/58dXR0KDExccCxhkIhhUIh63lnZ6dSU1MVDAYvuA0GdvvyHdbX76yadQ1HAgAAAFy+zs5OGYZxSd1gyH8T1dvbq5qaGp07d055eXk6evSo/H6/8vPzrYzT6dTkyZO1e/duSVJzc7N6enoiMh6PR1lZWVZmz549MgzDKlCSNGHCBBmGEZHJysqyCpQkFRQUKBQKqbm5+YJjrqqqst4iaBiGUlNTh3r4AAAAAG5StkvU66+/rj/7sz+T0+nUI488ou3btyszM1N+v1+S5HK5IvIul8ta5/f7FRsbq6SkpEEzKSkp/V43JSUlItP3dZKSkhQbG2tlBrJixQoFg0Hr0dbWZvPoAQAAANzsou1ukJGRoZaWFp05c0bbtm3TvHnz1NDQYK13OBwRedM0+y3rq29moPxQMn05nU45nc5BxwIAAAAAg7E9ExUbG6u/+Iu/0Pjx41VVVaW7775b//RP/yS32y1J/WaCOjo6rFkjt9utcDisQCAwaObkyZP9XvfUqVMRmb6vEwgE1NPT02+GCgAAAACupMv+nCjTNBUKhZSWlia32636+nprXTgcVkNDgyZOnChJysnJUUxMTESmvb1dra2tViYvL0/BYFD79u2zMnv37lUwGIzItLa2qr293crU1dXJ6XQqJyfncg8JAAAAAC7I1tv5vvWtb2nmzJlKTU1VV1eXampq9Oqrr6q2tlYOh0M+n0+VlZVKT09Xenq6KisrNWLECBUXF0uSDMPQggULVF5erlGjRmnkyJGqqKhQdna2pk2bJkkaN26cZsyYoZKSEm3cuFGStHDhQhUWFiojI0OSlJ+fr8zMTHm9Xq1du1anT59WRUWFSkpKuMseAAAAgGFlq0SdPHlSXq9X7e3tMgxDn/70p1VbW6vp06dLkpYuXaru7m6VlpYqEAgoNzdXdXV1SkhIsPaxYcMGRUdHa86cOeru7tbUqVO1efNmRUVFWZmtW7eqrKzMuotfUVGRqqurrfVRUVHasWOHSktLNWnSJMXFxam4uFjr1q27rJMBAAAAABdz2Z8TdT2zcy94ROJzogAAAHAjuSqfEwUAAAAANyNKFAAAAADYQIkCAAAAABsoUQAAAABgAyUKAAAAAGygRAEAAACADZQoAAAAALCBEgUAAAAANlCiAAAAAMAGShQAAAAA2ECJAgAAAAAbKFEAAAAAYAMlCgAAAABsoEQBAAAAgA2UKAAAAACwgRIFAAAAADZQogAAAADABkoUAAAAANhAiQIAAAAAGyhRAAAAAGADJQoAAAAAbKBEAQAAAIANlCgAAAAAsIESBQAAAAA2UKIAAAAAwAZKFAAAAADYYKtEVVVV6b777lNCQoJSUlL04IMP6siRIxEZ0zS1cuVKeTwexcXFacqUKTp48GBEJhQKafHixUpOTlZ8fLyKiop0/PjxiEwgEJDX65VhGDIMQ16vV2fOnInIHDt2TLNnz1Z8fLySk5NVVlamcDhs55AAAAA
AwBZbJaqhoUGPPvqompqaVF9frw8++ED5+fk6d+6clVmzZo3Wr1+v6upq7d+/X263W9OnT1dXV5eV8fl82r59u2pqatTY2KizZ8+qsLBQvb29Vqa4uFgtLS2qra1VbW2tWlpa5PV6rfW9vb2aNWuWzp07p8bGRtXU1Gjbtm0qLy+/nPMBAAAAAINymKZpDnXjU6dOKSUlRQ0NDXrggQdkmqY8Ho98Pp+WLVsm6cNZJ5fLpdWrV2vRokUKBoMaPXq0tmzZorlz50qSTpw4odTUVO3cuVMFBQU6dOiQMjMz1dTUpNzcXElSU1OT8vLydPjwYWVkZGjXrl0qLCxUW1ubPB6PJKmmpkbz589XR0eHEhMT+403FAopFApZzzs7O5WamqpgMDhgHhd2+/Id1tfvrJp1DUcCAAAAXL7Ozk4ZhnFJ3eCy/iYqGAxKkkaOHClJOnr0qPx+v/Lz862M0+nU5MmTtXv3bklSc3Ozenp6IjIej0dZWVlWZs+ePTIMwypQkjRhwgQZhhGRycrKsgqUJBUUFCgUCqm5uXnA8VZVVVlvDzQMQ6mpqZdz+AAAAABuQkMuUaZpasmSJbr//vuVlZUlSfL7/ZIkl8sVkXW5XNY6v9+v2NhYJSUlDZpJSUnp95opKSkRmb6vk5SUpNjYWCvT14oVKxQMBq1HW1ub3cMGAAAAcJOLHuqGjz32mH7/+9+rsbGx3zqHwxHx3DTNfsv66psZKD+UzJ9yOp1yOp2DjgMAAAAABjOkmajFixfrxRdf1CuvvKJbb73VWu52uyWp30xQR0eHNWvkdrsVDocVCAQGzZw8ebLf6546dSoi0/d1AoGAenp6+s1QAQAAAMCVYqtEmaapxx57TM8//7x++ctfKi0tLWJ9Wlqa3G636uvrrWXhcFgNDQ2aOHGiJCknJ0cxMTERmfb2drW2tlqZvLw8BYNB7du3z8rs3btXwWAwItPa2qr29nYrU1dXJ6fTqZycHDuHBQAAAACXzNbb+R599FH9/Oc/13/8x38oISHBmgkyDENxcXFyOBzy+XyqrKxUenq60tPTVVlZqREjRqi4uNjKLliwQOXl5Ro1apRGjhypiooKZWdna9q0aZKkcePGacaMGSopKdHGjRslSQsXLlRhYaEyMjIkSfn5+crMzJTX69XatWt1+vRpVVRUqKSkhDvtAQAAABg2tkrUk08+KUmaMmVKxPJnnnlG8+fPlyQtXbpU3d3dKi0tVSAQUG5ururq6pSQkGDlN2zYoOjoaM2ZM0fd3d2aOnWqNm/erKioKCuzdetWlZWVWXfxKyoqUnV1tbU+KipKO3bsUGlpqSZNmqS4uDgVFxdr3bp1tk4AAAAAANhxWZ8Tdb2zcy94ROJzogAAAHAjuWqfEwUAAAAANxtKFAAAAADYQIkCAAAAABsoUQAAAABgAyUKAAAAAGygRAEAAACADZQoAAAAALCBEgUAAAAANlCiAAAAAMAGShQAAAAA2ECJAgAAAAAbKFEAAAAAYAMlCgAAAABsoEQBAAAAgA2UKAAAAACwgRIFAAAAADZQogAAAADABkoUAAAAANhAiQIAAAAAGyhRAAAAAGADJQoAAAAAbKBEAQAAAIANlCgAAAAAsIESBQAAAAA2UKIAAAAAwAZKFAAAAADYYLtE/epXv9Ls2bPl8XjkcDj0wgsvRKw3TVMrV66Ux+NRXFycpkyZooMHD0ZkQqGQFi9erOTkZMXHx6uoqEjHjx+PyAQCAXm9XhmGIcMw5PV6debMmYjMsWPHNHv2bMXHxys5OVllZWUKh8N2DwkAAAAALpntEnXu3Dndfffdqq6uHnD9mjVrtH79elVXV2v//v1yu92aPn26urq6rIzP59P27dtVU1OjxsZGnT17VoWFhert7bUyxcXFamlpUW1trWpra9XS0iKv12ut7+3t1axZs3Tu3Dk1NjaqpqZG27ZtU3l5ud1DAgAAAIBL5jBN0xzyxg6Htm/frgcffFDSh7NQHo9
HPp9Py5Ytk/ThrJPL5dLq1au1aNEiBYNBjR49Wlu2bNHcuXMlSSdOnFBqaqp27typgoICHTp0SJmZmWpqalJubq4kqampSXl5eTp8+LAyMjK0a9cuFRYWqq2tTR6PR5JUU1Oj+fPnq6OjQ4mJif3GGwqFFAqFrOednZ1KTU1VMBgcMI8Lu335Duvrd1bNuoYjAQAAAC5fZ2enDMO4pG5wRf8m6ujRo/L7/crPz7eWOZ1OTZ48Wbt375YkNTc3q6enJyLj8XiUlZVlZfbs2SPDMKwCJUkTJkyQYRgRmaysLKtASVJBQYFCoZCam5sHHF9VVZX19kDDMJSamnrlDh4AAADATeGKlii/3y9JcrlcEctdLpe1zu/3KzY2VklJSYNmUlJS+u0/JSUlItP3dZKSkhQbG2tl+lqxYoWCwaD1aGtrG8JRAgAAALiZRQ/HTh0OR8Rz0zT7Leurb2ag/FAyf8rpdMrpdA46DgAAAAAYzBWdiXK73ZLUbyaoo6PDmjVyu90Kh8MKBAKDZk6ePNlv/6dOnYrI9H2dQCCgnp6efjNUAAAAAHClXNESlZaWJrfbrfr6emtZOBxWQ0ODJk6cKEnKyclRTExMRKa9vV2tra1WJi8vT8FgUPv27bMye/fuVTAYjMi0traqvb3dytTV1cnpdConJ+dKHhYAAAAAWGy/ne/s2bP67//+b+v50aNH1dLSopEjR2rs2LHy+XyqrKxUenq60tPTVVlZqREjRqi4uFiSZBiGFixYoPLyco0aNUojR45URUWFsrOzNW3aNEnSuHHjNGPGDJWUlGjjxo2SpIULF6qwsFAZGRmSpPz8fGVmZsrr9Wrt2rU6ffq0KioqVFJSwp32AAAAAAwb2yXqtdde01//9V9bz5csWSJJmjdvnjZv3qylS5equ7tbpaWlCgQCys3NVV1dnRISEqxtNmzYoOjoaM2ZM0fd3d2aOnWqNm/erKioKCuzdetWlZWVWXfxKyoqivhsqqioKO3YsUOlpaWaNGmS4uLiVFxcrHXr1tk/CwAAAABwiS7rc6Kud3buBY9IfE4UAAAAbiTX7HOiAAAAAOBGR4kCAAAAABsoUQAAAABgAyUKAAAAAGygRAEAAACADZQoAAAAALCBEgUAAAAANlCiAAAAAMAGShQAAAAA2ECJAgAAAAAbKFEAAAAAYAMlCgAAAABsoEQBAAAAgA2UKAAAAACwgRIFAAAAADZQogAAAADABkoUAAAAANhAiQIAAAAAG6Kv9QAAAABwddy+fEfE83dWzbpGIwGub8xEAQAAAIANzEThsvFbLQAArow//ZnKz1Pg44sSBQAAPraGu1Twi0AAQ0GJwhV3JX7g8UMNAPBx1/dn1Z/6059bQ/2ZNth2/JwEri1KFIbVYD9gAACw42oUhys18zXYz7+h/mwcjn0CGBpKFAAAuOFc6izRUPdxo+BvsIChoUQBACTx9qDhcKP/A/Vav91sOGZ0bmZ2zsuNeD0Ddlz3JeqJJ57Q2rVr1d7errvuukuPP/64PvvZz17rYeEKu9H/IYILu9T/9lfit84fd1f7GPn/7uNvOP7W5kpmL3UdAFxvHKZpmtd6EEP1i1/8Ql6vV0888YQmTZqkjRs36l/+5V/0xhtvaOzYsRfdvrOzU4ZhKBgMKjEx8SqM+MZxM/0wvNg/Sob6D9tLPYdX4x+vN9N/T1wZg804DMc+h2OG41q8xqXuc7ACy/+vuJ7xCxl8nNnpBtd1icrNzdW9996rJ5980lo2btw4Pfjgg6qqquqXD4VCCoVC1vNgMKixY8eqra2NEmVT1ndfutZDAAAAN6nWfyy41kPADaizs1Opqak6c+aMDMMYNHvdvp0vHA6rublZy5cvj1ien5+v3bt3D7hNVVWV/vEf/7Hf8tTU1GEZIwAAAK484/FrPQLcyLq6um7cEvXee++pt7dXLpcrYrnL5ZLf7x9wmxUrVmjJkiXW8/Pnz+v06dMaNWqUHA7HsI73RvRRW2c
mb/hwjq8OzvPw4xxfHZzn4cc5vjo4z8OPc9yfaZrq6uqSx+O5aPa6LVEf6Vt+TNO8YCFyOp1yOp0Ryz75yU8O19BuGomJifzPN8w4x1cH53n4cY6vDs7z8OMcXx2c5+HHOY50sRmoj3ximMcxbJKTkxUVFdVv1qmjo6Pf7BQAAAAAXCnXbYmKjY1VTk6O6uvrI5bX19dr4sSJ12hUAAAAAG501/Xb+ZYsWSKv16vx48crLy9PTz31lI4dO6ZHHnnkWg/tpuB0OvXd736331skceVwjq8OzvPw4xxfHZzn4cc5vjo4z8OPc3x5rutbnEsfftjumjVr1N7erqysLG3YsEEPPPDAtR4WAAAAgBvUdV+iAAAAAOBqum7/JgoAAAAArgVKFAAAAADYQIkCAAAAABsoUQAAAABgAyUKF/TEE08oLS1Nt9xyi3JycvTrX/960HxDQ4NycnJ0yy236FOf+pR+8pOfXKWRXp+qqqp03333KSEhQSkpKXrwwQd15MiRQbd59dVX5XA4+j0OHz58lUZ9/Vm5cmW/8+V2uwfdhmvZnttvv33A6/LRRx8dMM91fGl+9atfafbs2fJ4PHI4HHrhhRci1pumqZUrV8rj8SguLk5TpkzRwYMHL7rfbdu2KTMzU06nU5mZmdq+ffswHcHH32DnuKenR8uWLVN2drbi4+Pl8Xj0ta99TSdOnBh0n5s3bx7w+v7jH/84zEfz8XWxa3n+/Pn9zteECRMuul+u5f9zsXM80DXpcDi0du3aC+6Ta3lwlCgM6Be/+IV8Pp++/e1v68CBA/rsZz+rmTNn6tixYwPmjx49qs9//vP67Gc/qwMHDuhb3/qWysrKtG3btqs88utHQ0ODHn30UTU1Nam+vl4ffPCB8vPzde7cuYtue+TIEbW3t1uP9PT0qzDi69ddd90Vcb5ef/31C2a5lu3bv39/xPn96EPQv/zlLw+6Hdfx4M6dO6e7775b1dXVA65fs2aN1q9fr+rqau3fv19ut1vTp09XV1fXBfe5Z88ezZ07V16vV7/73e/k9Xo1Z84c7d27d7gO42NtsHP8/vvv67e//a2+853v6Le//a2ef/55vfnmmyoqKrrofhMTEyOu7fb2dt1yyy3DcQjXhYtdy5I0Y8aMiPO1c+fOQffJtRzpYue47/X405/+VA6HQ1/60pcG3S/X8iBMYAB/9Vd/ZT7yyCMRy+68805z+fLlA+aXLl1q3nnnnRHLFi1aZE6YMGHYxnij6ejoMCWZDQ0NF8y88sorpiQzEAhcvYFd57773e+ad9999yXnuZYv3ze+8Q3zjjvuMM+fPz/geq5j+ySZ27dvt56fP3/edLvd5qpVq6xlf/zjH03DMMyf/OQnF9zPnDlzzBkzZkQsKygoMB966KErPubrTd9zPJB9+/aZksx33333gplnnnnGNAzjyg7uBjLQeZ43b575hS98wdZ+uJYv7FKu5S984Qvm5z73uUEzXMuDYyYK/YTDYTU3Nys/Pz9ieX5+vnbv3j3gNnv27OmXLygo0Guvvaaenp5hG+uNJBgMSpJGjhx50ew999yjMWPGaOrUqXrllVeGe2jXvbfeeksej0dpaWl66KGH9Pbbb18wy7V8ecLhsJ577jk9/PDDcjgcg2a5jofu6NGj8vv9Edeq0+nU5MmTL/h9Wrrw9T3YNvg/wWBQDodDn/zkJwfNnT17VrfddptuvfVWFRYW6sCBA1dngNexV199VSkpKfrLv/xLlZSUqKOjY9A81/LQnTx5Ujt27NCCBQsumuVavjBKFPp577331NvbK5fLFbHc5XLJ7/cPuI3f7x8w/8EHH+i9994btrHeKEzT1JIlS3T//fcrKyvrgrkxY8boqaee0rZt2/T8888rIyNDU6dO1a9+9aurONrrS25urn72s5/ppZde0tNPPy2/36+JEyfqD3/4w4B5ruXL88ILL+jMmTOaP3/+BTNcx5fvo+/Fdr5Pf7Sd3W3woT/+8Y9avny
5iouLlZiYeMHcnXfeqc2bN+vFF1/Uv/7rv+qWW27RpEmT9NZbb13F0V5fZs6cqa1bt+qXv/ylfvjDH2r//v363Oc+p1AodMFtuJaH7tlnn1VCQoK++MUvDprjWh5c9LUeAD6++v4W2TTNQX+zPFB+oOXo77HHHtPvf/97NTY2DprLyMhQRkaG9TwvL09tbW1at26dHnjggeEe5nVp5syZ1tfZ2dnKy8vTHXfcoWeffVZLliwZcBuu5aHbtGmTZs6cKY/Hc8EM1/GVY/f79FC3udn19PTooYce0vnz5/XEE08Mmp0wYULETREmTZqke++9Vz/+8Y/1ox/9aLiHel2aO3eu9XVWVpbGjx+v2267TTt27Bj0H/pcy0Pz05/+VF/96lcv+rdNXMuDYyYK/SQnJysqKqrfb3M6Ojr6/dbnI263e8B8dHS0Ro0aNWxjvREsXrxYL774ol555RXdeuuttrefMGECvxWyIT4+XtnZ2Rc8Z1zLQ/fuu+/q5Zdf1te//nXb23Id2/PRHSbtfJ/+aDu729zsenp6NGfOHB09elT19fWDzkIN5BOf+ITuu+8+rm8bxowZo9tuu23Qc8a1PDS//vWvdeTIkSF9n+ZajkSJQj+xsbHKycmx7rD1kfr6ek2cOHHAbfLy8vrl6+rqNH78eMXExAzbWK9npmnqscce0/PPP69f/vKXSktLG9J+Dhw4oDFjxlzh0d24QqGQDh06dMFzxrU8dM8884xSUlI0a9Ys29tyHduTlpYmt9sdca2Gw2E1NDRc8Pu0dOHre7BtbmYfFai33npLL7/88pB+kWKaplpaWri+bfjDH/6gtra2Qc8Z1/LQbNq0STk5Obr77rttb8u13Me1uqMFPt5qamrMmJgYc9OmTeYbb7xh+nw+Mz4+3nznnXdM0zTN5cuXm16v18q//fbb5ogRI8xvfvOb5htvvGFu2rTJjImJMf/93//9Wh3Cx97f/d3fmYZhmK+++qrZ3t5uPd5//30r0/c8b9iwwdy+fbv55ptvmq2treby5ctNSea2bduuxSFcF8rLy81XX33VfPvtt82mpiazsLDQTEhI4Fq+wnp7e82xY8eay5Yt67eO63hourq6zAMHDpgHDhwwJZnr1683Dxw4YN0ZbtWqVaZhGObzzz9vvv766+ZXvvIVc8yYMWZnZ6e1D6/XG3FX1d/85jdmVFSUuWrVKvPQoUPmqlWrzOjoaLOpqemqH9/HwWDnuKenxywqKjJvvfVWs6WlJeL7dCgUsvbR9xyvXLnSrK2tNf/nf/7HPHDggPm3f/u3ZnR0tLl3795rcYgfC4Od566uLrO8vNzcvXu3efToUfOVV14x8/LyzD//8z/nWrbhYt8vTNM0g8GgOWLECPPJJ58ccB9cy/ZQonBB//zP/2zedtttZmxsrHnvvfdG3Hp73rx55uTJkyPyr776qnnPPfeYsbGx5u23337B/0nxIUkDPp555hkr0/c8r1692rzjjjvMW265xUxKSjLvv/9+c8eOHVd/8NeRuXPnmmPGjDFjYmJMj8djfvGLXzQPHjxoredavjJeeuklU5J55MiRfuu4jofmo1vB933MmzfPNM0Pb3P+3e9+13S73abT6TQfeOAB8/XXX4/Yx+TJk638R/7t3/7NzMjIMGNiYsw777zzpi6vg53jo0ePXvD79CuvvGLto+859vl85tixY83Y2Fhz9OjRZn5+vrl79+6rf3AfI4Od5/fff9/Mz883R48ebcbExJhjx441582bZx47dixiH1zLg7vY9wvTNM2NGzeacXFx5pkzZwbcB9eyPQ7T/P//YhoAAAAAcFH8TRQAAAAA2ECJAgAAAAAbKFEAAAAAYAMlCgAAAABsoEQBAAAAgA2UKAAAAACwgRIFAAAAADZQogAAAADABkoUAAAAANhAiQIAAAAAGyhRAAAAAGDD/weZDTD8mZiVvAAAAABJRU5ErkJggg==", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.figure(figsize=(10,4))\n", "\n", "plt.hist(np.log1p(target['target']), bins = 200);" ] }, { "cell_type": "markdown", "id": "a0b88862-bd0a-4493-9342-38f20407bf78", "metadata": {}, "source": [ "В распределении очень много нулей, так что стоит смотреть чуть уже:" ] }, { "cell_type": "code", "execution_count": 12, "id": "a7da6bae-7c27-4cc3-ba62-82d3a606a15f", "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAA0gAAAFfCAYAAAB0q+zRAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjAsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvlHJYcgAAAAlwSFlzAAAPYQAAD2EBqD+naQAANAtJREFUeJzt3X9UVPed//HXVGBUCrcCgWFWNDY1xBTiSTELY9No/IGyISQ1Z7WlZ1ZbV5PGaFn1pJqcPSHndMXajaaVjbWuVSOm5OwmpumaToJHJXURf1DZqrHUbjXFE0ZMFgY1dCDmfv/o9X4z/NJBkF/Pxzn3HO6973v53OvFmdd87v2MwzRNUwAAAAAAfa6vGwAAAAAA/QUBCQAAAAAsBCQAAAAAsBCQAAAAAMBCQAIAAAAACwEJAAAAACwEJAAAAACwRPR1A3rLp59+qg8++EAxMTFyOBx93RwAAAAAfcQ0TV26dElut1uf+1zXfUSDNiB98MEHSklJ6etmAAAAAOgnamtrNXr06C5rBm1AiomJkfTXkxAbG9vHrQEAAADQV5qampSSkmJnhK4M2oB07ba62NhYAhIAAACAG3r0hkEaAAAAAMBCQAIAAAAACwEJAAAAACwEJAAAAACwEJAAAAAAwEJAAgAAAAALAQkAAAAALAQkAAAAALAQkAAAAADAQkACAAAAAAsBCQAAAAAsBCQAAAAAsET0dQOGittX7QmZP7f2oT5qCQAAAIDO0IMEAAAAABYCEgAAAABYbiogFRUVyeFwqKCgwF5mmqYKCwvldrs1YsQITZ06VadOnQrZLhgMaunSpUpISFB0dLTy8vJ0/vz5kJqGhgZ5vV4ZhiHDMOT1etXY2HgzzQUAAACALnU7IB09elQ/+9nPdM8994QsX7dundavX6/i4mIdPXpULpdLM2fO1KVLl+yagoIC7d69W6WlpTp48KAuX76s3NxcXb161a7Jz89XdXW1fD6ffD6fqqur5fV6u9tcAAAAALiubgWky5cv61vf+pa2bNmiUaNG2ctN09SLL76oZ599VnPmzFFaWpp27Nihjz/+WK+88ookKRAIaOvWrXrhhRc0Y8YM3XvvvSopKdGJEye0d+9eSdLp06fl8/n07//+7/J4PPJ4PNqyZYv+67/+SzU1NT1w2AAAAADQXrcC0pIlS/TQQw9pxowZIcvPnj0rv9+v7Oxse5nT6dSUKVNUUVEhSaqqqlJra2tIjdvtVlpaml1z6NAhGYahzMxMuyYrK0uGYdg1bQWDQTU1NYVMAAAAABCOsIf5Li0t1W9/+1sdPXq03Tq/3y9JSkpKClmelJSk999/366JiooK6Xm6VnNte7/fr8TExHb7T0xMtGvaKioq0vPPPx/u4QAAAACALawepNraWn3ve99TSUmJhg8f3mmdw+EImTdNs92yttrWdFTf1X5Wr16tQCBgT7W1tV3+PgAAAABoK6yAVFVVpfr6emVkZCgiIkIREREqLy/XT37yE0VERNg9R217
eerr6+11LpdLLS0tamho6LLmwoUL7X7/xYsX2/VOXeN0OhUbGxsyAQAAAEA4wgpI06dP14kTJ1RdXW1PkyZN0re+9S1VV1fri1/8olwul8rKyuxtWlpaVF5ersmTJ0uSMjIyFBkZGVJTV1enkydP2jUej0eBQEBHjhyxaw4fPqxAIGDXAAAAAEBPC+sZpJiYGKWlpYUsi46OVnx8vL28oKBAa9as0fjx4zV+/HitWbNGI0eOVH5+viTJMAwtXLhQK1asUHx8vOLi4rRy5Uqlp6fbgz5MmDBBs2fP1qJFi7R582ZJ0uLFi5Wbm6vU1NSbPmgAAAAA6EjYgzRcz9NPP63m5mY9+eSTamhoUGZmpt555x3FxMTYNRs2bFBERITmzp2r5uZmTZ8+Xdu3b9ewYcPsml27dmnZsmX2aHd5eXkqLi7u6eYCAAAAgM1hmqbZ143oDU1NTTIMQ4FAoF88j3T7qj0h8+fWPtRHLQEAAACGlnCyQbe+BwkAAAAABiMCEgAAAABYCEgAAAAAYCEgAQAAAICFgAQAAAAAFgISAAAAAFgISAAAAABgISABAAAAgIWABAAAAAAWAhIAAAAAWAhIAAAAAGAhIAEAAACAhYAEAAAAABYCEgAAAABYCEgAAAAAYCEgAQAAAICFgAQAAAAAFgISAAAAAFgISAAAAABgISABAAAAgIWABAAAAAAWAhIAAAAAWAhIAAAAAGAhIAEAAACAhYAEAAAAAJawAtKmTZt0zz33KDY2VrGxsfJ4PPr1r39tr1+wYIEcDkfIlJWVFbKPYDCopUuXKiEhQdHR0crLy9P58+dDahoaGuT1emUYhgzDkNfrVWNjY/ePEgAAAABuQFgBafTo0Vq7dq2OHTumY8eOadq0aXrkkUd06tQpu2b27Nmqq6uzp7feeitkHwUFBdq9e7dKS0t18OBBXb58Wbm5ubp69apdk5+fr+rqavl8Pvl8PlVXV8vr9d7koQIAAABA1yLCKX744YdD5v/lX/5FmzZtUmVlpb785S9LkpxOp1wuV4fbBwIBbd26VTt37tSMGTMkSSUlJUpJSdHevXs1a9YsnT59Wj6fT5WVlcrMzJQkbdmyRR6PRzU1NUpNTe1w38FgUMFg0J5vamoK59AAAAAAoPvPIF29elWlpaW6cuWKPB6PvfzAgQNKTEzUnXfeqUWLFqm+vt5eV1VVpdbWVmVnZ9vL3G630tLSVFFRIUk6dOiQDMOww5EkZWVlyTAMu6YjRUVF9i15hmEoJSWlu4cGAAAAYIgKOyCdOHFCn//85+V0OvXEE09o9+7duvvuuyVJOTk52rVrl/bt26cXXnhBR48e1bRp0+yeHb/fr6ioKI0aNSpkn0lJSfL7/XZNYmJiu9+bmJho13Rk9erVCgQC9lRbWxvuoQEAAAAY4sK6xU6SUlNTVV1drcbGRr322muaP3++ysvLdffdd2vevHl2XVpamiZNmqSxY8dqz549mjNnTqf7NE1TDofDnv/sz53VtOV0OuV0OsM9HAAAAACwhd2DFBUVpS996UuaNGmSioqKNHHiRP34xz/usDY5OVljx47VmTNnJEkul0stLS1qaGgIqauvr1dSUpJdc+HChXb7unjxol0DAAAAAL3hpr8HyTTNkMERPuujjz5SbW2tkpOTJUkZGRmKjIxUWVmZXVNXV6eTJ09q8uTJkiSPx6NAIKAjR47YNYcPH1YgELBrAAAAAKA3hHWL3TPPPKOcnBylpKTo0qVLKi0t1YEDB+Tz+XT58mUVFhbqscceU3Jyss6dO6dnnnlGCQkJ+vrXvy5JMgxDCxcu1IoVKxQfH6+4uDitXLlS6enp9qh2EyZM0OzZs7Vo0SJt3rxZkrR48WLl5uZ2OoIdAAAAAPSEsALShQsX5PV6VVdXJ8MwdM8998jn82nmzJlqbm7WiRMn9PLLL6uxsVHJycl68MEH9eqrryomJsbex4YNGxQREaG5c+equblZ06dP1/bt2zVs2DC7
ZteuXVq2bJk92l1eXp6Ki4t76JABAAAAoGMO0zTNvm5Eb2hqapJhGAoEAoqNje3r5uj2VXtC5s+tfaiPWgIAAAAMLeFkg5t+BgkAAAAABgsCEgAAAABYCEgAAAAAYCEgAQAAAICFgAQAAAAAFgISAAAAAFgISAAAAABgISABAAAAgIWABAAAAAAWAhIAAAAAWAhIAAAAAGAhIAEAAACAhYAEAAAAABYCEgAAAABYCEgAAAAAYCEgAQAAAICFgAQAAAAAFgISAAAAAFgISAAAAABgISABAAAAgIWABAAAAAAWAhIAAAAAWAhIAAAAAGAhIAEAAACAJayAtGnTJt1zzz2KjY1VbGysPB6Pfv3rX9vrTdNUYWGh3G63RowYoalTp+rUqVMh+wgGg1q6dKkSEhIUHR2tvLw8nT9/PqSmoaFBXq9XhmHIMAx5vV41NjZ2/ygBAAAA4AaEFZBGjx6ttWvX6tixYzp27JimTZumRx55xA5B69at0/r161VcXKyjR4/K5XJp5syZunTpkr2PgoIC7d69W6WlpTp48KAuX76s3NxcXb161a7Jz89XdXW1fD6ffD6fqqur5fV6e+iQAQAAAKBjDtM0zZvZQVxcnH70ox/pO9/5jtxutwoKCvT9739f0l97i5KSkvTDH/5Qjz/+uAKBgG677Tbt3LlT8+bNkyR98MEHSklJ0VtvvaVZs2bp9OnTuvvuu1VZWanMzExJUmVlpTwej37/+98rNTX1htrV1NQkwzAUCAQUGxt7M4fYI25ftSdk/tzah/qoJQAAAMDQEk426PYzSFevXlVpaamuXLkij8ejs2fPyu/3Kzs7265xOp2aMmWKKioqJElVVVVqbW0NqXG73UpLS7NrDh06JMMw7HAkSVlZWTIMw67pSDAYVFNTU8gEAAAAAOEIOyCdOHFCn//85+V0OvXEE09o9+7duvvuu+X3+yVJSUlJIfVJSUn2Or/fr6ioKI0aNarLmsTExHa/NzEx0a7pSFFRkf3MkmEYSklJCffQAAAAAAxxYQek1NRUVVdXq7KyUt/97nc1f/58vffee/Z6h8MRUm+aZrtlbbWt6aj+evtZvXq1AoGAPdXW1t7oIQEAAACApG4EpKioKH3pS1/SpEmTVFRUpIkTJ+rHP/6xXC6XJLXr5amvr7d7lVwul1paWtTQ0NBlzYULF9r93osXL7brnfosp9Npj653bQIAAACAcNz09yCZpqlgMKhx48bJ5XKprKzMXtfS0qLy8nJNnjxZkpSRkaHIyMiQmrq6Op08edKu8Xg8CgQCOnLkiF1z+PBhBQIBuwYAAAAAekNEOMXPPPOMcnJylJKSokuXLqm0tFQHDhyQz+eTw+FQQUGB1qxZo/Hjx2v8+PFas2aNRo4cqfz8fEmSYRhauHChVqxYofj4eMXFxWnlypVKT0/XjBkzJEkTJkzQ7NmztWjRIm3evFmStHjxYuXm5t7wCHYAAAAA0B1hBaQLFy7I6/Wqrq5OhmHonnvukc/n08yZMyVJTz/9tJqbm/Xkk0+qoaFBmZmZeueddxQTE2PvY8OGDYqIiNDcuXPV3Nys6dOna/v27Ro2bJhds2vXLi1btswe7S4vL0/FxcU9cbwAAAAA0Kmb/h6k/orvQQIAAAAg3aLvQQIAAACAwYaABAAAAAAWAhIAAAAAWAhIAAAAAGAhIAEAAACAhYAEAAAAABYCEgAAAABYCEgAAAAAYCEgAQAAAICFgAQAAAAAFgISAAAAAFgi+roBAAAA+P9uX7UnZP7c2of6qCXA0EQPEgAAAABYCEgAAAAAYCEgAQAAAICFgAQAAAAAFgISAAAAAFgISAAAAABgISABAAAAgIWABAAAAAAWAhIAAAAAWAhIAAAAAGAhIAEAAACAhYAEAAAAABYCEgAAAABYwgpIRUVFuu+++xQTE6PExEQ9+uijqqmpCalZsGCBHA5HyJSVlRVSEwwGtXTpUiUkJCg6Olp5eXk6f/58SE1DQ4O8
Xq8Mw5BhGPJ6vWpsbOzeUQIAAADADQgrIJWXl2vJkiWqrKxUWVmZPvnkE2VnZ+vKlSshdbNnz1ZdXZ09vfXWWyHrCwoKtHv3bpWWlurgwYO6fPmycnNzdfXqVbsmPz9f1dXV8vl88vl8qq6ultfrvYlDBQAAAICuRYRT7PP5Qua3bdumxMREVVVV6YEHHrCXO51OuVyuDvcRCAS0detW7dy5UzNmzJAklZSUKCUlRXv37tWsWbN0+vRp+Xw+VVZWKjMzU5K0ZcsWeTwe1dTUKDU1NayDBAAAAIAbcVPPIAUCAUlSXFxcyPIDBw4oMTFRd955pxYtWqT6+np7XVVVlVpbW5WdnW0vc7vdSktLU0VFhSTp0KFDMgzDDkeSlJWVJcMw7Jq2gsGgmpqaQiYAAAAACEe3A5Jpmlq+fLnuv/9+paWl2ctzcnK0a9cu7du3Ty+88IKOHj2qadOmKRgMSpL8fr+ioqI0atSokP0lJSXJ7/fbNYmJie1+Z2Jiol3TVlFRkf28kmEYSklJ6e6hAQAAABiiwrrF7rOeeuop/e53v9PBgwdDls+bN8/+OS0tTZMmTdLYsWO1Z88ezZkzp9P9maYph8Nhz3/2585qPmv16tVavny5Pd/U1ERIAgAAABCWbvUgLV26VG+++ab279+v0aNHd1mbnJyssWPH6syZM5Ikl8ullpYWNTQ0hNTV19crKSnJrrlw4UK7fV28eNGuacvpdCo2NjZkAgAAAIBwhBWQTNPUU089pddff1379u3TuHHjrrvNRx99pNraWiUnJ0uSMjIyFBkZqbKyMrumrq5OJ0+e1OTJkyVJHo9HgUBAR44csWsOHz6sQCBg1wAAAABATwvrFrslS5bolVde0S9/+UvFxMTYzwMZhqERI0bo8uXLKiws1GOPPabk5GSdO3dOzzzzjBISEvT1r3/drl24cKFWrFih+Ph4xcXFaeXKlUpPT7dHtZswYYJmz56tRYsWafPmzZKkxYsXKzc3lxHsAAAAAPSasALSpk2bJElTp04NWb5t2zYtWLBAw4YN04kTJ/Tyyy+rsbFRycnJevDBB/Xqq68qJibGrt+wYYMiIiI0d+5cNTc3a/r06dq+fbuGDRtm1+zatUvLli2zR7vLy8tTcXFxd48TAAAAAK4rrIBkmmaX60eMGKG33377uvsZPny4Nm7cqI0bN3ZaExcXp5KSknCaBwAAAAA35aa+BwkAAAAABhMCEgAAAABYCEgAAAAAYCEgAQAAAIAlrEEaAAAAhrrbV+2xfz639qE+bAmA3kAPEgAAAABYCEgAAAAAYCEgAQAAAICFgAQAAAAAFgISAAAAAFgISAAAAABgISABAAAAgIWABAAAAAAWvigWAAD0a5/9YlaJL2cF0LvoQQIAAAAACwEJAAAAACwEJAAAAACwEJAAAAAAwEJAAgAAAAALAQkAAAAALAQkAAAAALAQkAAAAADAQkACAAAAAAsBCQAAAAAsBCQAAAAAsIQVkIqKinTfffcpJiZGiYmJevTRR1VTUxNSY5qmCgsL5Xa7NWLECE2dOlWnTp0KqQkGg1q6dKkSEhIUHR2tvLw8nT9/PqSmoaFBXq9XhmHIMAx5vV41NjZ27ygBAAAA4AaEFZDKy8u1ZMkSVVZWqqysTJ988omys7N15coVu2bdunVav369iouLdfToUblcLs2cOVOXLl2yawoKCrR7926Vlpbq4MGDunz5snJzc3X16lW7Jj8/X9XV1fL5fPL5fKqurpbX6+2BQwYAAACAjkWEU+zz+ULmt23bpsTERFVVVemBBx6QaZp68cUX9eyzz2rOnDmSpB07digpKUmvvPKKHn/8cQUCAW3dulU7d+7UjBkzJEklJSVKSUnR3r17NWvWLJ0+fVo+n0+VlZXKzMyUJG3ZskUej0c1NTVKTU3tiWMHAADo925ftcf++dzah/qwJcDQcFPPIAUCAUlSXFycJOns2bPy+/3Kzs62a5xOp6ZMmaKKigpJ
UlVVlVpbW0Nq3G630tLS7JpDhw7JMAw7HElSVlaWDMOwa9oKBoNqamoKmQAAAAAgHN0OSKZpavny5br//vuVlpYmSfL7/ZKkpKSkkNqkpCR7nd/vV1RUlEaNGtVlTWJiYrvfmZiYaNe0VVRUZD+vZBiGUlJSuntoAAAAAIaobgekp556Sr/73e/0i1/8ot06h8MRMm+aZrtlbbWt6ai+q/2sXr1agUDAnmpra2/kMAAAAADA1q2AtHTpUr355pvav3+/Ro8ebS93uVyS1K6Xp76+3u5VcrlcamlpUUNDQ5c1Fy5caPd7L1682K536hqn06nY2NiQCQAAAADCEVZAMk1TTz31lF5//XXt27dP48aNC1k/btw4uVwulZWV2ctaWlpUXl6uyZMnS5IyMjIUGRkZUlNXV6eTJ0/aNR6PR4FAQEeOHLFrDh8+rEAgYNcAAAAAQE8LaxS7JUuW6JVXXtEvf/lLxcTE2D1FhmFoxIgRcjgcKigo0Jo1azR+/HiNHz9ea9as0ciRI5Wfn2/XLly4UCtWrFB8fLzi4uK0cuVKpaen26PaTZgwQbNnz9aiRYu0efNmSdLixYuVm5vLCHYAAAAAek1YAWnTpk2SpKlTp4Ys37ZtmxYsWCBJevrpp9Xc3Kwnn3xSDQ0NyszM1DvvvKOYmBi7fsOGDYqIiNDcuXPV3Nys6dOna/v27Ro2bJhds2vXLi1btswe7S4vL0/FxcXdOUYAAAAAuCFhBSTTNK9b43A4VFhYqMLCwk5rhg8fro0bN2rjxo2d1sTFxamkpCSc5gEAAADATQkrIAEAALTFF5kCGEwISH2EFxMAAACg/+n29yABAAAAwGBDDxIAABi0uGMDQLgISIPIZ18EJF4IAAD4LF4nAdwIbrEDAAAAAAsBCQAAAAAsBCQAAAAAsBCQAAAAAMBCQAIAAAAACwEJAAAAACwEJAAAAACwEJAAAAAAwMIXxQIAAHQTXz4LDD70IAEAAACAhR4kAACANj7bMxROr1BX23V3nwBuLQISAAAY8treKgdg6CIg9QPcvwwAAAD0DzyDBAAAAAAWAhIAAAAAWLjFDgAAhIXndQAMZgSkAY4XKQAA+ideo4GBiVvsAAAAAMBCDxIAAEAXbkVPEL1NQP9BDxIAAAAAWMIOSO+++64efvhhud1uORwOvfHGGyHrFyxYIIfDETJlZWWF1ASDQS1dulQJCQmKjo5WXl6ezp8/H1LT0NAgr9crwzBkGIa8Xq8aGxvDPkAAAICO3L5qjz0BwDVh32J35coVTZw4Ud/+9rf12GOPdVgze/Zsbdu2zZ6PiooKWV9QUKBf/epXKi0tVXx8vFasWKHc3FxVVVVp2LBhkqT8/HydP39ePp9PkrR48WJ5vV796le/CrfJAAAA/QqhDOi/wg5IOTk5ysnJ6bLG6XTK5XJ1uC4QCGjr1q3auXOnZsyYIUkqKSlRSkqK9u7dq1mzZun06dPy+XyqrKxUZmamJGnLli3yeDyqqalRampquM0e0D77n+i5tQ/1YUsAAACAwa1XnkE6cOCAEhMTdeedd2rRokWqr6+311VVVam1tVXZ2dn2MrfbrbS0NFVUVEiSDh06JMMw7HAkSVlZWTIMw65pKxgMqqmpKWQCAACDD7fGAehNPR6QcnJytGvXLu3bt08vvPCCjh49qmnTpikYDEqS/H6/oqKiNGrUqJDtkpKS5Pf77ZrExMR2+05MTLRr2ioqKrKfVzIMQykpKT18ZAAAAAAGux4f5nvevHn2z2lpaZo0aZLGjh2rPXv2aM6cOZ1uZ5qmHA6HPf/Znzur+azVq1dr+fLl9nxTU9OgDEl8WgYAGCy4hRxAf9Tr34OUnJyssWPH6syZM5Ikl8ullpYWNTQ0hPQi1dfXa/LkyXbNhQsX2u3r4sWLSkpK6vD3OJ1OOZ3OXjiCW48QBAAAAPSNXv8epI8++ki1tbVKTk6WJGVkZCgyMlJlZWV2TV1dnU6ePGkHJI/Ho0Ag
oCNHjtg1hw8fViAQsGsAAAAAoKeF3YN0+fJl/fGPf7Tnz549q+rqasXFxSkuLk6FhYV67LHHlJycrHPnzumZZ55RQkKCvv71r0uSDMPQwoULtWLFCsXHxysuLk4rV65Uenq6PardhAkTNHv2bC1atEibN2+W9NdhvnNzc4fcCHYAAAAAbp2wA9KxY8f04IMP2vPXnvuZP3++Nm3apBMnTujll19WY2OjkpOT9eCDD+rVV19VTEyMvc2GDRsUERGhuXPnqrm5WdOnT9f27dvt70CSpF27dmnZsmX2aHd5eXkqLi7u9oECAAAAwPWEHZCmTp0q0zQ7Xf/2229fdx/Dhw/Xxo0btXHjxk5r4uLiVFJSEm7zAAAAAKDbev0ZJAAAAAAYKHp9FDsAAIDe0nbkV4YLB3Cz6EECAAAAAAs9SAAAYNDguwQB3Cx6kAAAAADAQkACAAAAAAsBCQAAAAAsBCQAAAAAsBCQAAAAAMDCKHYAAKDP8X1GAPoLAhIAAIPMZ8MGQQMAwsMtdgAAAABgISABAAAAgIWABAAAAAAWnkECAAA9hsEWAAx09CABAAAAgIWABAAAAAAWbrEDAAAYILiFEeh99CABAAAAgIWABAAAAAAWbrEDAAC95rO3hHE7GICBgIAEAABuibbPz/RULQD0JG6xAwAAAAALAQkAAAAALNxiBwAAMAgwBDjQM8LuQXr33Xf18MMPy+12y+Fw6I033ghZb5qmCgsL5Xa7NWLECE2dOlWnTp0KqQkGg1q6dKkSEhIUHR2tvLw8nT9/PqSmoaFBXq9XhmHIMAx5vV41NjaGfYAAAAAAcKPCDkhXrlzRxIkTVVxc3OH6devWaf369SouLtbRo0flcrk0c+ZMXbp0ya4pKCjQ7t27VVpaqoMHD+ry5cvKzc3V1atX7Zr8/HxVV1fL5/PJ5/OpurpaXq+3G4cIAAAwON2+ao89AegZYd9il5OTo5ycnA7XmaapF198Uc8++6zmzJkjSdqxY4eSkpL0yiuv6PHHH1cgENDWrVu1c+dOzZgxQ5JUUlKilJQU7d27V7NmzdLp06fl8/lUWVmpzMxMSdKWLVvk8XhUU1Oj1NTU7h4vAAAAAHSqRwdpOHv2rPx+v7Kzs+1lTqdTU6ZMUUVFhSSpqqpKra2tITVut1tpaWl2zaFDh2QYhh2OJCkrK0uGYdg1bQWDQTU1NYVMAAAgFD0OANC1Hg1Ifr9fkpSUlBSyPCkpyV7n9/sVFRWlUaNGdVmTmJjYbv+JiYl2TVtFRUX280qGYSglJeWmjwcAAADA0NIrw3w7HI6QedM02y1rq21NR/Vd7Wf16tUKBAL2VFtb242WAwAAABjKejQguVwuSWrXy1NfX2/3KrlcLrW0tKihoaHLmgsXLrTb/8WLF9v1Tl3jdDoVGxsbMgEAAABAOHo0II0bN04ul0tlZWX2spaWFpWXl2vy5MmSpIyMDEVGRobU1NXV6eTJk3aNx+NRIBDQkSNH7JrDhw8rEAjYNQAwkH32ORCeBQHQG/g/BuiesEexu3z5sv74xz/a82fPnlV1dbXi4uI0ZswYFRQUaM2aNRo/frzGjx+vNWvWaOTIkcrPz5ckGYahhQsXasWKFYqPj1dcXJxWrlyp9PR0e1S7CRMmaPbs2Vq0aJE2b94sSVq8eLFyc3MZwQ4AAABArwk7IB07dkwPPvigPb98+XJJ0vz587V9+3Y9/fTTam5u1pNPPqmGhgZlZmbqnXfeUUxMjL3Nhg0bFBERoblz56q5uVnTp0/X9u3bNWzYMLtm165dWrZsmT3aXV5eXqffvQQAAG5eVz0N59Y+dAtbAgB9J+yANHXqVJmm2el6h8OhwsJCFRYWdlozfPhwbdy4URs3buy0Ji4uTiUlJeE2DwAGpM++MeWNKAAAfadXRrEDAAAAgIEo7B4kAAAw9PCgP4ChgoA0iHHLDgDcGm3DQ3/6P5dgAwDh4RY7AAAAALAQkAAAAADAwi12AAAMUdx+
BwDtEZAA3BCeaQMAAEMBt9gBAAAAgIUeJAAd4tYbDCb9tQf0Zka/66/HBAADHT1IAAAAAGChBwkA2ujP32kDAAB6Fz1IAAAAAGChBwkAgAGA5wIB4NagBwkAAAAALPQgAQAADHGMigj8fwQkYAjr7mAEQ30QA95I9H9D7Xa0oXa8ANCbCEjAEMKbKGBg4W8WPWWof7AFhIOABADX0dWb1K7edPDmtnP0wgEA+isCEoAhiU9T0R8RqgGg7xGQAEC8MUXvIYwDwMBCQAIGoHDezIfzZmwghoTeOhfd1RPnsDfeUPMmHQCAG0NAGiJ4c4T+jmsU/RHPSgHA0ENAQju8IRhcBmKvUFuEJwAAcKsQkDAo3kCjb3U3VIczOlxP4FoHAADXQ0DCoEHPV//EvwsA9D98YAR07nM9vcPCwkI5HI6QyeVy2etN01RhYaHcbrdGjBihqVOn6tSpUyH7CAaDWrp0qRISEhQdHa28vDydP3++p5sKYIC6fdUeewIAAOhJvdKD9OUvf1l79+6154cNG2b/vG7dOq1fv17bt2/XnXfeqR/84AeaOXOmampqFBMTI0kqKCjQr371K5WWlio+Pl4rVqxQbm6uqqqqQvYFoP8htAD8HQDAQNYrASkiIiKk1+ga0zT14osv6tlnn9WcOXMkSTt27FBSUpJeeeUVPf744woEAtq6dat27typGTNmSJJKSkqUkpKivXv3atasWR3+zmAwqGAwaM83NTX1wpFhoBgMD/UPhmNA7+vqjTjXTP9EeAKA/q1XAtKZM2fkdrvldDqVmZmpNWvW6Itf/KLOnj0rv9+v7Oxsu9bpdGrKlCmqqKjQ448/rqqqKrW2tobUuN1upaWlqaKiotOAVFRUpOeff743DmdQ4gW6fxrs/y6D/fgAYDDgAzoMdT0ekDIzM/Xyyy/rzjvv1IULF/SDH/xAkydP1qlTp+T3+yVJSUlJIdskJSXp/ffflyT5/X5FRUVp1KhR7Wqubd+R1atXa/ny5fZ8U1OTUlJSeuqwhqyu/pPs6/9A+/ObbQYmAAAAGJh6PCDl5OTYP6enp8vj8eiOO+7Qjh07lJWVJUlyOBwh25im2W5ZW9ercTqdcjqdN9Fy3IgbDSV9HZ76s94YEhuDW28E7hu9Na8vbuHjAwYAQF/q9WG+o6OjlZ6erjNnzujRRx+V9NdeouTkZLumvr7e7lVyuVxqaWlRQ0NDSC9SfX29Jk+e3NvNBW4pQg/CxTXTf/BvAQCDU68HpGAwqNOnT+trX/uaxo0bJ5fLpbKyMt17772SpJaWFpWXl+uHP/yhJCkjI0ORkZEqKyvT3LlzJUl1dXU6efKk1q1b19vNxU3o7puFW/EJ9a3+BB7AjbvVf0vd7eHmbx4AhoYeD0grV67Uww8/rDFjxqi+vl4/+MEP1NTUpPnz58vhcKigoEBr1qzR+PHjNX78eK1Zs0YjR45Ufn6+JMkwDC1cuFArVqxQfHy84uLitHLlSqWnp9uj2mHo6E+32vRGAAR6U29fe9xKCwAYjHo8IJ0/f17f/OY39eGHH+q2225TVlaWKisrNXbsWEnS008/rebmZj355JNqaGhQZmam3nnnHfs7kCRpw4YNioiI0Ny5c9Xc3Kzp06dr+/btfAfSIEFgAABg4OhPH1YCt0KPB6TS0tIu1zscDhUWFqqwsLDTmuHDh2vjxo3auHFjD7cOAxnBCrg1+FsDAAxlvf4MEtDfcFsQ0DtuxbN+vfH3yqfjAIDPIiBhyOvqzRGfpAPd01vBpi+GHQcADC0EJNwSAyVoDJR2AgDQF7gLA0MBAQkAMCj0xAccfEgCAPhcXzcAAAAAAPoLepAAAL2OnhlgcGKQEwxG9CABAAAAgIWABAAAAAAWAhIAAAAAWHgGCQAAADeNIcAxWNCDBAAAAAAWAhIAAAAAWLjFDgAAAD2OIcAxUNGDBAAAAAAWAhIAAAAAWLjFDgAAAL2q7Qh3n8Xtd+hv6EECAAAAAAsBCQAA
AAAs3GIHAACAfoPR79DXCEgAAADoM109nwT0BW6xAwAAAAALPUgAAADolxj9Dn2BgAQAAIABp214IjChpxCQAAAAMKgQnnAzCEgAAAAY8Lq6HY9b9RCOfj9Iw0svvaRx48Zp+PDhysjI0G9+85u+bhIAAAAGidtX7bEnQOrnPUivvvqqCgoK9NJLL+mrX/2qNm/erJycHL333nsaM2ZMXzcPAAAAg8jNhCR6ogYPh2maZl83ojOZmZn6yle+ok2bNtnLJkyYoEcffVRFRUUhtcFgUMFg0J4PBAIaM2aMamtrFRsbe8va3Jm0597u6yYAAACgnzn5/Ky+bsKQ0NTUpJSUFDU2NsowjC5r+20PUktLi6qqqrRq1aqQ5dnZ2aqoqGhXX1RUpOeff77d8pSUlF5rIwAAAHAzjBf7ugVDy6VLlwZuQPrwww919epVJSUlhSxPSkqS3+9vV7969WotX77cnv/000/1f//3f4qPj5fD4ej19nblWmLtL71ZQw3nv+9w7vsW57/vcO77Fue/b3H++w7nvnOmaerSpUtyu93Xre23AematuHGNM0OA4/T6ZTT6QxZ9oUvfKE3mxa22NhYLtY+xPnvO5z7vsX57zuc+77F+e9bnP++w7nv2PV6jq7pt6PYJSQkaNiwYe16i+rr69v1KgEAAABAT+i3ASkqKkoZGRkqKysLWV5WVqbJkyf3UasAAAAADGb9+ha75cuXy+v1atKkSfJ4PPrZz36mP//5z3riiSf6umlhcTqdeu6559rdAohbg/Pfdzj3fYvz33c4932L89+3OP99h3PfM/r1MN/SX78odt26daqrq1NaWpo2bNigBx54oK+bBQAAAGAQ6vcBCQAAAABulX77DBIAAAAA3GoEJAAAAACwEJAAAAAAwEJAAgAAAAALAamHvPTSSxo3bpyGDx+ujIwM/eY3v+myvry8XBkZGRo+fLi++MUv6qc//ektaungUlRUpPvuu08xMTFKTEzUo48+qpqami63OXDggBwOR7vp97///S1q9eBQWFjY7hy6XK4ut+G67zm33357h9fxkiVLOqznuu++d999Vw8//LDcbrccDofeeOONkPWmaaqwsFBut1sjRozQ1KlTderUqevu97XXXtPdd98tp9Opu+++W7t37+6lIxjYujr/ra2t+v73v6/09HRFR0fL7XbrH/7hH/TBBx90uc/t27d3+Pfwl7/8pZePZuC53vW/YMGCducxKyvruvvl+r++6537jq5hh8OhH/3oR53uk2v/xhCQesCrr76qgoICPfvsszp+/Li+9rWvKScnR3/+8587rD979qz+7u/+Tl/72td0/PhxPfPMM1q2bJlee+21W9zyga+8vFxLlixRZWWlysrK9Mknnyg7O1tXrly57rY1NTWqq6uzp/Hjx9+CFg8uX/7yl0PO4YkTJzqt5brvWUePHg0599e+VPvv//7vu9yO6z58V65c0cSJE1VcXNzh+nXr1mn9+vUqLi7W0aNH5XK5NHPmTF26dKnTfR46dEjz5s2T1+vV//zP/8jr9Wru3Lk6fPhwbx3GgNXV+f/444/129/+Vv/8z/+s3/72t3r99df1hz/8QXl5edfdb2xsbMjfQl1dnYYPH94bhzCgXe/6l6TZs2eHnMe33nqry31y/d+Y6537ttfvz3/+czkcDj322GNd7pdr/waYuGl/+7d/az7xxBMhy+666y5z1apVHdY//fTT5l133RWy7PHHHzezsrJ6rY1DRX19vSnJLC8v77Rm//79piSzoaHh1jVsEHruuefMiRMn3nA9133v+t73vmfecccd5qefftrheq77niHJ3L17tz3/6aefmi6Xy1y7dq297C9/+YtpGIb505/+tNP9zJ0715w9e3bIslmzZpnf+MY3erzNg0nb89+RI0eOmJLM999/v9Oabdu2mYZh9GzjhoCOzv/8+fPNRx55JKz9cP2H70au/UceecScNm1alzVc+zeGHqSb1NLS
oqqqKmVnZ4csz87OVkVFRYfbHDp0qF39rFmzdOzYMbW2tvZaW4eCQCAgSYqLi7tu7b333qvk5GRNnz5d+/fv7+2mDUpnzpyR2+3WuHHj9I1vfEN/+tOfOq3luu89LS0tKikp0Xe+8x05HI4ua7nue9bZs2fl9/tDrm2n06kpU6Z0+hogdf730NU2uDGBQEAOh0Nf+MIXuqy7fPmyxo4dq9GjRys3N1fHjx+/NQ0chA4cOKDExETdeeedWrRokerr67us5/rveRcuXNCePXu0cOHC69Zy7V8fAekmffjhh7p69aqSkpJCliclJcnv93e4jd/v77D+k08+0YcffthrbR3sTNPU8uXLdf/99ystLa3TuuTkZP3sZz/Ta6+9ptdff12pqamaPn263n333VvY2oEvMzNTL7/8st5++21t2bJFfr9fkydP1kcffdRhPdd973njjTfU2NioBQsWdFrDdd87rv0/H85rwLXtwt0G1/eXv/xFq1atUn5+vmJjYzutu+uuu7R9+3a9+eab+sUvfqHhw4frq1/9qs6cOXMLWzs45OTkaNeuXdq3b59eeOEFHT16VNOmTVMwGOx0G67/nrdjxw7FxMRozpw5XdZx7d+YiL5uwGDR9lNb0zS7/CS3o/qOluPGPfXUU/rd736ngwcPdlmXmpqq1NRUe97j8ai2tlb/+q//qgceeKC3mzlo5OTk2D+np6fL4/Hojjvu0I4dO7R8+fIOt+G67x1bt25VTk6O3G53pzVc970r3NeA7m6DzrW2tuob3/iGPv30U7300ktd1mZlZYUMJPDVr35VX/nKV7Rx40b95Cc/6e2mDirz5s2zf05LS9OkSZM0duxY7dmzp8s361z/PevnP/+5vvWtb133WSKu/RtDD9JNSkhI0LBhw9p96lFfX9/u05FrXC5Xh/URERGKj4/vtbYOZkuXLtWbb76p/fv3a/To0WFvn5WVxacnNyk6Olrp6emdnkeu+97x/vvva+/evfrHf/zHsLflur9510ZuDOc14Np24W6DzrW2tmru3Lk6e/asysrKuuw96sjnPvc53Xffffw99IDk5GSNHTu2y3PJ9d+zfvOb36impqZbrwNc+x0jIN2kqKgoZWRk2CNIXVNWVqbJkyd3uI3H42lX/84772jSpEmKjIzstbYORqZp6qmnntLrr7+uffv2ady4cd3az/Hjx5WcnNzDrRtagsGgTp8+3el55LrvHdu2bVNiYqIeeuihsLflur9548aNk8vlCrm2W1paVF5e3ulrgNT530NX26Bj18LRmTNntHfv3m594GKapqqrq/l76AEfffSRamtruzyXXP89a+vWrcrIyNDEiRPD3pZrvxN9NTrEYFJaWmpGRkaaW7duNd977z2zoKDAjI6ONs+dO2eapmmuWrXK9Hq9dv2f/vQnc+TIkeY//dM/me+99565detWMzIy0vzP//zPvjqEAeu73/2uaRiGeeDAAbOurs6ePv74Y7um7fnfsGGDuXv3bvMPf/iDefLkSXPVqlWmJPO1117ri0MYsFasWGEeOHDA/NOf/mRWVlaaubm5ZkxMDNf9LXT16lVzzJgx5ve///1267jue86lS5fM48ePm8ePHzclmevXrzePHz9uj5K2du1a0zAM8/XXXzdPnDhhfvOb3zSTk5PNpqYmex9erzdkZNP//u//NocNG2auXbvWPH36tLl27VozIiLCrKysvOXH1991df5bW1vNvLw8c/To0WZ1dXXI60AwGLT30fb8FxYWmj6fz/zf//1f8/jx4+a3v/1tMyIiwjx8+HBfHGK/1tX5v3TpkrlixQqzoqLCPHv2rLl//37T4/GYf/M3f8P13wOu93+PaZpmIBAwR44caW7atKnDfXDtdw8BqYf827/9mzl27FgzKirK/MpXvhIyzPT8+fPNKVOmhNQfOHDAvPfee82oqCjz9ttv7/TCRtckdTht27bNrml7/n/4wx+ad9xxhzl8+HBz1KhR5v33
32/u2bPn1jd+gJs3b56ZnJxsRkZGmm6325wzZ4556tQpez3Xfe97++23TUlmTU1Nu3Vc9z3n2hDpbaf58+ebpvnXob6fe+450+VymU6n03zggQfMEydOhOxjypQpdv01//Ef/2GmpqaakZGR5l133UVY7URX5//s2bOdvg7s37/f3kfb819QUGCOGTPGjIqKMm+77TYzOzvbrKiouPUHNwB0df4//vhjMzs727ztttvMyMhIc8yYMeb8+fPNP//5zyH74Prvnuv932Oaprl582ZzxIgRZmNjY4f74NrvHodpWk9JAwAAAMAQxzNIAAAAAGAhIAEAAACAhYAEAAAAABYCEgAAAABYCEgAAAAAYCEgAQAAAICFgAQAAAAAFgISAAAAAFgISAAAAABgISABAAAAgIWABAAAAACW/wcYxtnyre/F7QAAAABJRU5ErkJggg==", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.figure(figsize=(10,4))\n", "\n", "plt.hist(np.log1p(target.loc[target['target'] > 0, 'target']), bins = 200);" ] }, { "cell_type": "markdown", "id": "38caffeb-d66a-4393-bfdc-9ece2ec720ac", "metadata": {}, "source": [ "Распределение выглядит очень похожим на смесь:\n", "- Есть клиенты с значениями около нуля\n", "- Есть компонента смеси с центром в районе 6, то есть ~400 (np.exp(6) - 1)\n", "- Есть компонента справа, с центров в районе 13, то есть ~440,000\n", "- И есть еще клиенты с ровно 0, которых мы убрали с графика\n", "\n", "Выглядит заманчиво и для ML, и визуализации. Но нас пока интересует только сабмит.\n", "\n", "### Catboost\n", "\n", "Начнём собирать всё что нам потребуется дя обучения Catboost-а.\n", "\n", "- Будем ли мы проверять, что порядок `user_id` полностью совпадает в train и target?\n", "- Будем ли мы сразу настраивать свою валидацию и делить данные?\n", "- Или может быть будем что-либо преобразовывать?\n", "\n", "Нет, нас интересует atboost сабмит ASAP 🤗️️️️️️ " ] }, { "cell_type": "code", "execution_count": 13, "id": "378306c2-1373-4868-96d2-0e9b54bcde9f", "metadata": {}, "outputs": [], "source": [ "train_pool = Pool(data = train, \n", " label = np.log1p(target['target']), \n", " cat_features = categorical_features_indices)" ] }, { "cell_type": "markdown", "id": "e35224b7-1329-47ce-8ec6-e69d0ae6a0a9", "metadata": {}, "source": [ "#### Обучение\n", "\n", "Главные настройки, которые нам стоит учесть:\n", "\n", "- Так как метрика соревнования это RMSLE, а мы уже логарифмировали (log1p) целевую переменную, оптимизировать мы будем RMSE\n", "- У нас много пропусков в данных, поэтому нам очень повезло что у Catboost есть настройка nan_mode" ] }, { "cell_type": "code", "execution_count": 14, "id": "ff688458-d150-42e8-99d1-b8aced7ffd61", "metadata": {}, "outputs": [], "source": [ "model = CatBoostRegressor(iterations = 100, \n", " depth = 6, \n", " learning_rate = 0.1, \n", " 
loss_function = 'RMSE', \n", " nan_mode = 'Min', \n", " random_seed = 314,\n", " verbose = 10)\n" ] }, { "cell_type": "code", "execution_count": 15, "id": "3cd7c516-80b2-438c-ac59-f14e181c09ee", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0:\tlearn: 5.1242446\ttotal: 128ms\tremaining: 12.7s\n", "10:\tlearn: 2.7903763\ttotal: 891ms\tremaining: 7.21s\n", "20:\tlearn: 2.2642536\ttotal: 1.65s\tremaining: 6.21s\n", "30:\tlearn: 2.1466213\ttotal: 2.33s\tremaining: 5.18s\n", "40:\tlearn: 2.1033289\ttotal: 3s\tremaining: 4.31s\n", "50:\tlearn: 2.0807207\ttotal: 3.63s\tremaining: 3.48s\n", "60:\tlearn: 2.0606817\ttotal: 4.26s\tremaining: 2.72s\n", "70:\tlearn: 2.0467002\ttotal: 4.91s\tremaining: 2s\n", "80:\tlearn: 2.0315319\ttotal: 5.6s\tremaining: 1.31s\n", "90:\tlearn: 2.0204322\ttotal: 6.22s\tremaining: 615ms\n", "99:\tlearn: 2.0125081\ttotal: 6.84s\tremaining: 0us\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model.fit(train_pool)" ] }, { "cell_type": "markdown", "id": "fe974317-8a22-4a11-abf7-298245c3e3a9", "metadata": {}, "source": [ "Мы успешно обучили модель 🌟️️️️️️\n", "\n", "И вправду — зачем нам валидация, если можно ее сразу отправить в соревнование и узнать наш результат на лидерборде? Он же не будет прямо сильно хуже чем в логе обучения? (ведь правда, да?)\n", "\n", "### Подготовка сабмита\n", "\n", "Посмотрим на пример рабочего бейзлайн решения. 
\n", "\n", "Именно в таком формате платформа ждет от нас решения:" ] }, { "cell_type": "code", "execution_count": 16, "id": "456c9f2e-e51c-4419-a402-871cfd1322d6", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(318451, 2)" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sample = pd.read_csv('data/task3/sample_submit_naive.csv')\n", "sample.shape" ] }, { "cell_type": "code", "execution_count": 17, "id": "da43b692-9ec7-415d-9c61-42ffaaf6cdf5", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
user_idpredict
010000081.004656e+06
110000090.000000e+00
210000135.047758e+02
310000161.680799e+05
410000172.222542e+02
\n", "
" ], "text/plain": [ " user_id predict\n", "0 1000008 1.004656e+06\n", "1 1000009 0.000000e+00\n", "2 1000013 5.047758e+02\n", "3 1000016 1.680799e+05\n", "4 1000017 2.222542e+02" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sample.head(5)" ] }, { "cell_type": "markdown", "id": "526becca-affc-439c-a378-15aa630705f3", "metadata": {}, "source": [ "С форматом тоже всё понятно. \n", "\n", "Важно заметить, что предсказания от нас ждут без преобразований целевой переменной, так что нужно будет сделать обратные преобразования предсказаний нашей модели.\n", "\n", "#### Использование модели \n", "\n", "Тестовые данные у нас уже есть, но их нужно подготовить для формата Catboost-а." ] }, { "cell_type": "code", "execution_count": 18, "id": "4ba55359-c12f-49af-9ddd-7cc50594f2d8", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
user_idapp_children_cntapp_dependent_cntapp_family_cntapp_income_appapp_real_estate_indapp_vehicle_indavg_dep_avg_balance_12month_amtavg_dep_avg_balance_12month_amt_termavg_dep_avg_balance_12month_amt_term_savings...savings_sum_oms_debet_3msavings_sum_oms_debet_6msavings_sum_oms_debet_9msavings_sum_oms_debet_12msavings_service_model_cdsavings_pension_flgsavings_deposit_flgsavings_safe_acc_flgsavings_broker_flgsavings_oms_flg
01000008NaNNaNNaNNaNNaNNaN998138.5625002678.6992191.009246e+06...0.0000000.00000067.8935090.000000Массовый00100
110000090.0NaNNaN29125.3945310.00.00.000030NaNNaN...8.40705054.11141670.21389082.739632Массовый00100
210000130.00.00.059536.8164060.00.054086.031250NaN3.513455e+04...0.0000000.0000000.00000056.554066Массовый00100
310000160.0NaNNaN66908.4687500.00.060340.105469NaN6.347482e+04...33.32173259.4613990.0000000.000000Массовый00110
41000017NaNNaNNaNNaNNaNNaN0.000030NaN0.000000e+00...26.5271950.00000056.96283059.217648Массовый00100
\n", "

5 rows × 280 columns

\n", "
" ], "text/plain": [ " user_id app_children_cnt app_dependent_cnt app_family_cnt \\\n", "0 1000008 NaN NaN NaN \n", "1 1000009 0.0 NaN NaN \n", "2 1000013 0.0 0.0 0.0 \n", "3 1000016 0.0 NaN NaN \n", "4 1000017 NaN NaN NaN \n", "\n", " app_income_app app_real_estate_ind app_vehicle_ind \\\n", "0 NaN NaN NaN \n", "1 29125.394531 0.0 0.0 \n", "2 59536.816406 0.0 0.0 \n", "3 66908.468750 0.0 0.0 \n", "4 NaN NaN NaN \n", "\n", " avg_dep_avg_balance_12month_amt avg_dep_avg_balance_12month_amt_term \\\n", "0 998138.562500 2678.699219 \n", "1 0.000030 NaN \n", "2 54086.031250 NaN \n", "3 60340.105469 NaN \n", "4 0.000030 NaN \n", "\n", " avg_dep_avg_balance_12month_amt_term_savings ... \\\n", "0 1.009246e+06 ... \n", "1 NaN ... \n", "2 3.513455e+04 ... \n", "3 6.347482e+04 ... \n", "4 0.000000e+00 ... \n", "\n", " savings_sum_oms_debet_3m savings_sum_oms_debet_6m \\\n", "0 0.000000 0.000000 \n", "1 8.407050 54.111416 \n", "2 0.000000 0.000000 \n", "3 33.321732 59.461399 \n", "4 26.527195 0.000000 \n", "\n", " savings_sum_oms_debet_9m savings_sum_oms_debet_12m \\\n", "0 67.893509 0.000000 \n", "1 70.213890 82.739632 \n", "2 0.000000 56.554066 \n", "3 0.000000 0.000000 \n", "4 56.962830 59.217648 \n", "\n", " savings_service_model_cd savings_pension_flg savings_deposit_flg \\\n", "0 Массовый 0 0 \n", "1 Массовый 0 0 \n", "2 Массовый 0 0 \n", "3 Массовый 0 0 \n", "4 Массовый 0 0 \n", "\n", " savings_safe_acc_flg savings_broker_flg savings_oms_flg \n", "0 1 0 0 \n", "1 1 0 0 \n", "2 1 0 0 \n", "3 1 1 0 \n", "4 1 0 0 \n", "\n", "[5 rows x 280 columns]" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "test.head(5)" ] }, { "cell_type": "code", "execution_count": 19, "id": "3a7a771e-ed7d-4aa9-bf43-cfaaec8a707a", "metadata": {}, "outputs": [], "source": [ "test_pool = Pool(data = test, \n", " cat_features = categorical_features_indices)" ] }, { "cell_type": "code", "execution_count": 20, "id": "684501d8-d52a-4cbf-8e68-fdd1f7f8e246", 
"metadata": {}, "outputs": [ { "data": { "text/plain": [ "(318451,)" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "test_predict = model.predict(test_pool)\n", "test_predict.shape" ] }, { "cell_type": "markdown", "id": "bd893712-bddb-485a-adad-9ec65b0ed3e5", "metadata": {}, "source": [ "Обратные преобразования (не забываем -1):" ] }, { "cell_type": "code", "execution_count": 21, "id": "e19e0142-4df9-445b-9ecd-c6514ba27415", "metadata": {}, "outputs": [], "source": [ "test_full_predict = np.exp(test_predict) - 1" ] }, { "cell_type": "markdown", "id": "f3a99017-1335-4a4c-b883-d7d68c462e28", "metadata": {}, "source": [ "#### Упаковка сабмита\n", "\n", "Так как мы торопимся отправить решение, мы снова доверимся воле случая, что все 'user_id' отсортированы за нас 🌚️️️️️️\n", "\n", "И мы просто перепишем предсказания в исходном сабмит файле." ] }, { "cell_type": "code", "execution_count": 22, "id": "d70eb17b-caa6-4c05-9748-d918c3715171", "metadata": {}, "outputs": [], "source": [ "sample['predict'] = test_full_predict" ] }, { "cell_type": "code", "execution_count": 23, "id": "9f54656f-1491-4af6-aaa5-697155bf97a5", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
user_idpredict
01000008290314.317272
11000009-0.030440
2100001311.244460
31000016271.858813
4100001712.335367
\n", "
" ], "text/plain": [ " user_id predict\n", "0 1000008 290314.317272\n", "1 1000009 -0.030440\n", "2 1000013 11.244460\n", "3 1000016 271.858813\n", "4 1000017 12.335367" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sample.head(5)" ] }, { "cell_type": "markdown", "id": "f819060c-0fc8-46a0-8304-b58e85f657e8", "metadata": {}, "source": [ "Финишная прямая — пишем файл:\n" ] }, { "cell_type": "code", "execution_count": 24, "id": "0496c64c-87b0-4545-9e34-fb0f99f3354b", "metadata": {}, "outputs": [], "source": [ "sample.to_csv('submit_baseline_catboost.csv', index=False)" ] }, { "cell_type": "markdown", "id": "82e301c7-379c-4ec1-a861-477d2f2b86a6", "metadata": {}, "source": [ "И... результат на паблике это 4.169190539294088\t\n", "\n", "- Это лучше чем наивный сабмит с 5.848489205052006\n", "- Но это точно не метрика, показываемая в логах при обучении\n", "- И этот результат вряд ли пошёл бы в продакшен\n", "\n", "Можно ли это улучшить? " ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.2" } }, "nbformat": 4, "nbformat_minor": 5 }