Parses Viber exports into a Pandas DataFrame and generates the following statistics:
- Total number of messages
- Number of unique participants
- Date range of conversations
- Average messages per day
- Distribution of text vs. media messages
- Activity over time (daily message count chart)
- Activity by hour of day (bar chart showing when people are most active)
- Activity by day of week (which days have the most chat activity)
- Activity heatmap (shows patterns across both day and hour)
- Message distribution by sender (who talks the most)
- Response time analysis (how quickly each person typically responds)
- Average message length by sender
- Word cloud visualization of frequently used words
- Most common words used in the chat
- Emoji usage analysis
- Conversation length statistics
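To make the activity statistics above more concrete, here is a minimal sketch of how per-hour counts and a day-by-hour heatmap grid could be derived from message timestamps. It uses only the standard library; the sample timestamps and the `activity_grid` helper are hypothetical and are not taken from the ViberChatStatAnalyzer source, which may compute these differently.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample timestamps; a real run would take them from the parsed export.
timestamps = [
    datetime(2019, 8, 29, 19, 48, 51),
    datetime(2019, 8, 29, 20, 25, 38),
    datetime(2019, 9, 2, 14, 18, 55),
    datetime(2019, 9, 7, 12, 44, 19),
]


def activity_grid(stamps):
    """Count messages per (weekday, hour) cell; weekday 0 is Monday."""
    return Counter((ts.weekday(), ts.hour) for ts in stamps)


# Each key is one heatmap cell, each value is the number of messages in it.
grid = activity_grid(timestamps)

# Collapsing the weekday axis gives the "activity by hour of day" counts.
by_hour = Counter(ts.hour for ts in timestamps)
```

A plotting library would then turn `grid` into a heatmap and `by_hour` into a bar chart.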
The repository includes an example_output_directory where you can explore the example output generated by running ViberChatStatAnalyzer on the dummy file example_viber_export.csv.
ViberChatStatAnalyzer works with CSV files exported from Viber. To generate the required export file, follow these steps:
- Open the Viber app on your phone.
- Go to Settings > Calls and Messages.
- Scroll down and select the Email Message History option.
- Viber will create a ZIP file containing all selected chat histories in CSV format.
- Send this ZIP file to your email or transfer it to your computer using another method.
- Extract the contents of the ZIP file.
- Locate and open the CSV file corresponding to the chat you're interested in.
You can now use this CSV file as input for ViberChatStatAnalyzer to explore chat statistics and insights. For an example of what the exported file looks like, check out example_viber_export.csv.
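If you prefer to script the extraction step above rather than unpack the archive by hand, a minimal sketch with Python's standard `zipfile` module might look like this. The `extract_csv_files` helper is hypothetical, and the sketch assumes only that the archive contains `.csv` files, as described above.

```python
import zipfile
from pathlib import Path


def extract_csv_files(zip_path, dest_dir):
    """Extract only the .csv members of a ZIP archive and return their paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Skip anything that is not a CSV chat history.
            if name.lower().endswith(".csv"):
                extracted.append(Path(archive.extract(name, dest)))
    return extracted
```

Calling `extract_csv_files` with the path to the ZIP file you received and a target folder returns the paths of the extracted CSV files, any of which can then be passed to the analyzer.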
git clone https://github.com/Ambeteco/ViberChatStatAnalyzer.git
cd ViberChatStatAnalyzer
python -m pip install -r requirements.txt -U
The script supports direct command-line execution, which is the recommended method for most users. Running it this way also automatically downloads the NLTK resources "punkt", "stopwords", and "punkt_tab", which the script requires.
python viber_chat_stat_analyzer/main.py "path/to/viber_export.csv" -o "output_directory"
The program will parse your chat file ("path/to/viber_export.csv"), generate all the insights, and save them as images and text files to the specified directory ("output_directory"). This will give you a complete picture of your conversation patterns, including who talks most, when conversations typically happen, common topics, and more.
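If you have several exported chats, the same command can be repeated once per file. The sketch below is one way to do that and makes a few assumptions: it is run from the repository root, all exports sit in a single folder, and each chat gets its own output subfolder. The `build_commands` and `run_all` helpers are hypothetical and not part of ViberChatStatAnalyzer.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(csv_dir, output_root):
    """Return one CLI command (as an argument list) per CSV file in csv_dir."""
    commands = []
    for csv_path in sorted(Path(csv_dir).glob("*.csv")):
        # Assumption: one output subfolder per chat, named after the CSV file.
        out_dir = Path(output_root) / csv_path.stem
        commands.append([
            sys.executable,
            "viber_chat_stat_analyzer/main.py",
            str(csv_path),
            "-o",
            str(out_dir),
        ])
    return commands


def run_all(csv_dir, output_root):
    """Run the analyzer once per export; stops at the first failing command."""
    for command in build_commands(csv_dir, output_root):
        subprocess.run(command, check=True)
```

Building the commands separately from running them makes it easy to print and inspect them before anything is executed.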
Apart from the CLI, you can also use ViberChatStatAnalyzer as a library.
from viber_chat_stat_analyzer import ViberChatAnalyzer
# Create an analyzer instance with your Viber chat export file
analyzer = ViberChatAnalyzer("viber_export.csv")
# Generate *all* insights in one go. This writes:
# - Text files: basic_stats.txt, language_stats.txt
# - CSVs: response_times.csv, conversation_stats.csv
# - PNGs (all of the plots): activity_over_time.png, activity_by_hour.png,
#   activity_by_day.png, sender_distribution.png, media_vs_text.png,
#   response_times.png, word_cloud.png, activity_heatmap.png
analyzer.generate_all_insights("output_directory")
...
# Interactive Visualizations (will open matplotlib windows)
analyzer.plot_activity_over_time()
analyzer.plot_activity_by_hour()
analyzer.plot_activity_by_day_of_week()
analyzer.plot_sender_distribution()
analyzer.plot_media_vs_text()
analyzer.plot_response_times()
analyzer.generate_word_cloud()
analyzer.generate_heatmap_activity()
# Save visualizations directly to image files
analyzer.plot_activity_over_time("out/activity_over_time.png")
analyzer.plot_activity_by_hour("out/activity_by_hour.png")
analyzer.plot_activity_by_day_of_week("out/activity_by_day.png")
analyzer.plot_sender_distribution("out/sender_distribution.png")
analyzer.plot_media_vs_text("out/media_vs_text.png")
analyzer.plot_response_times("out/response_times.png")
analyzer.generate_word_cloud("out/word_cloud.png")
analyzer.generate_heatmap_activity("out/activity_heatmap.png")
# Get basic chat statistics
basic_stats = analyzer.get_basic_stats()
print(basic_stats)
# {'total_messages': 36, 'total_senders': 2, 'date_range': (Timestamp('2019-01-01 ...
# Get response times
response_times = analyzer.analyze_response_times()
print(response_times)
# mean median min max count
# ...
# Get language-related statistics
language_stats = analyzer.analyze_language_stats()
print(language_stats)  # Shows word and emoji usage
# {'total_words': 10000, 'unique_words': 4000, 'top_words': ...
# Analyze conversation lengths
conversation_stats = analyzer.analyze_conversation_length()
print(conversation_stats)
# start_time end_time duration messages participants
# conversation_id
# 0 2019-08-29 19:48:51 2019-08-29 20:25:38 36.783333 3 1
# 1 2019-09-02 14:18:55 2019-09-02 14:18:55 0.000000 1 1
# 2 2019-09-07 12:44:19 2019-09-07 12:44:19 0.000000 1 1
# ...
# If you want to save specific analyses to files
response_times.to_csv('response_times.csv')
conversation_stats.to_csv('conversation_lengths.csv')
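To help read the conversation table shown above: its first row runs from 19:48:51 to 20:25:38 with a duration of 36.783333, which is consistent with durations measured in minutes. The sketch below shows one plausible way such a table could be produced, by starting a new conversation whenever the silence between two messages exceeds a threshold. The 60-minute threshold, the sender name, the middle timestamp, and both helper functions are assumptions for illustration; the library's actual rule may differ.

```python
from datetime import datetime, timedelta

# Assumption: a silence longer than this starts a new conversation.
GAP = timedelta(minutes=60)


def split_conversations(messages, gap=GAP):
    """Group time-sorted (timestamp, sender) pairs into conversations."""
    conversations = []
    for stamp, sender in messages:
        # Compare against the timestamp of the last message seen so far.
        if conversations and stamp - conversations[-1][-1][0] <= gap:
            conversations[-1].append((stamp, sender))
        else:
            conversations.append([(stamp, sender)])
    return conversations


def summarize(conversation):
    """Return start, end, duration in minutes, message and participant counts."""
    start, end = conversation[0][0], conversation[-1][0]
    return {
        "start_time": start,
        "end_time": end,
        "duration": (end - start).total_seconds() / 60,
        "messages": len(conversation),
        "participants": len({sender for _, sender in conversation}),
    }


# Hypothetical messages loosely modeled on the first two rows of the table.
messages = [
    (datetime(2019, 8, 29, 19, 48, 51), "Alice"),
    (datetime(2019, 8, 29, 20, 5, 0), "Alice"),
    (datetime(2019, 8, 29, 20, 25, 38), "Alice"),
    (datetime(2019, 9, 2, 14, 18, 55), "Alice"),
]
stats = [summarize(c) for c in split_conversations(messages)]
```

With these inputs, the first three messages form one conversation and the fourth starts a new one with a duration of zero, mirroring the single-message rows in the table.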
- Ensure you have the required dependencies installed (pip install pandas matplotlib seaborn nltk wordcloud emoji)
- Note that Viber uses localized text for media files, such as photos. Currently, the script supports English, Spanish, French, German, Ukrainian, Russian, Polish, and Portuguese.
- Some methods might require downloading additional NLTK resources, which is done automatically in the CLI. To download the necessary resources manually, run this code:
import nltk
nltk.download("punkt")
nltk.download("stopwords")
nltk.download("punkt_tab")
ViberChatStatAnalyzer is licensed under MIT.