A welcome message and useful information from the organisers.
You will also find useful information on our website www.python-summit.ch/venue. Or feel free to ask any member of staff if you have a question.
Sometimes an approximate answer is good enough. How do you check for duplicates, count unique users, or track item popularity when your dataset won’t fit in memory? Enter probabilistic data structures like Bloom filters, Count-Min Sketches, and HyperLogLog! This talk introduces these powerful tools, demonstrates simple implementations in Python, and gives you ideas on when to use them.
Walk away ready to apply these techniques in your own projects - no advanced math required.
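A Bloom filter, one of the structures the abstract mentions, really does fit in a few lines of standard-library Python. The sketch below is illustrative only (the size and hash-count values are arbitrary choices, not recommendations): it answers "definitely not seen" or "probably seen" using a fixed-size bit array.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: membership tests with possible false positives,
    but never false negatives, in a fixed amount of memory."""

    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size

    def _positions(self, item):
        # Derive several bit positions from one item by salting the hash.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # If any bit is unset, the item was definitely never added.
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
bf.add("alice")
print(bf.might_contain("alice"))    # True
print(bf.might_contain("mallory"))  # almost certainly False
```

A real deployment would size the bit array and hash count from the expected item count and acceptable false-positive rate.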
Code review is a central part of a developer's everyday job. The motivation for this talk was a quote:
“The most important superpower of a developer is complaining about everybody else's code.” In this talk, I’ll explain my approach to better code review.
Sometimes it’s hard to convince a colleague to change, or not to change, some lines of code. In my talk I would like to cover some best practices from my software engineering experience for efficient and honest code review: how to create a culture of good code review, how to apply automated tools to cut down on repetitive comments and suggestions, and how to write or reuse coding style guides for your team to reduce the time spent arguing about naming conventions and styles.
What should be automated during code review, and what should not? Reusable patterns play a key role in keeping feedback consistent and not confusing colleagues.
Bytecode, the internal instruction language used by the interpreter, is something that perhaps most Python developers have heard about, but few have dug into. This talk will try to explain the idea behind bytecode and how it works.
We will see how to extract bytecode from functions with the dis module and from .pyc files (and what the idea behind __pycache__ directories is). Then, the other way around: we’ll check the possibility of building new functions from raw bytes at runtime.
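As a small taste of what the dis module exposes (the example function is arbitrary):

```python
import dis

def add_one(x):
    return x + 1

# dis.dis prints a human-readable disassembly of the function's bytecode.
dis.dis(add_one)

# The raw bytes live on the function's code object.
print(add_one.__code__.co_code)

# dis.get_instructions yields Instruction objects we can inspect in code.
opnames = [ins.opname for ins in dis.get_instructions(add_one)]
print(opnames)
```

The exact opcode names vary between CPython versions, which is one reason bytecode is considered an implementation detail rather than a stable API.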
Why should you, as a Python developer, learn Rust? In this talk, we will explore Rust's compelling answers to this question. Rust offers guaranteed type safety and memory safety without a garbage collector, "fearless concurrency," and incredible performance. We will look into some of Rust's most distinctive features from a Python perspective, including its strict compiler, the ownership and borrowing system, null safety via Option, and explicit mutability. We will discover how Rust eliminates common runtime errors at compile time, and additionally, how understanding the concepts behind Rust's safety features can sharpen your Python skills, helping to write more robust and reliable code. By the end of the talk, you'll understand Rust's core value proposition and how its paradigms can benefit you, whether you are writing Python, Rust, or any other language.
Are you writing nested loops when solving coding challenges? Discover how Python's functional programming toolbox can transform your problem-solving approach.
We'll explore functional programming principles through the lens of Advent of Code puzzles, learning to think in streams of data rather than step-by-step instructions. We'll cover some essential bits from the itertools, functools, and operator modules, aiming to write more expressive, debuggable code.
Starting with pure functions and lazy evaluation, we'll build up to solving real AoC problems using techniques like:
- itertools.pairwise() for sequence comparisons
- functools.reduce() for data aggregation
- operator.itemgetter() for elegant sorting
- Generator expressions for memory-efficient processing
Through some puzzles from various years, we’ll see how functional approaches often lead to more concise solutions that closely mirror the problem description. We'll compare imperative vs functional solutions, highlighting pros and cons of both approaches.
Whether you're preparing for coding interviews, tackling AoC, or just want to expand your Python toolkit, you'll leave with a couple more ideas for writing cleaner, more Pythonic code—no external dependencies required.
In many parts of the world, especially across Africa, software cannot assume a stable internet connection. From rural communities to field agents working in transit or enforcement, the reality is simple: offline is the default, and sync is a luxury.
In this talk, we’ll explore how to build offline-first applications using Python — apps that work gracefully when the network doesn’t. Drawing on scenarios from real-world civic and infrastructure projects in Nigeria, I’ll walk through techniques to queue, cache, and sync data locally, using tools like SQLite, Redis, Celery, and FastAPI. We’ll explore design patterns that prevent data loss, improve user experience, and simplify reconciliation once connectivity is restored.
Whether you’re building field data tools, mobile dashboards, or lightweight IoT integrations, this session will equip you with the mindset and technical building blocks to ensure your Python applications stay resilient — no matter the network conditions.
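One of the simplest building blocks for this is a local "outbox": queue writes in SQLite and drain them whenever a connection comes back. The sketch below uses hypothetical table and function names, with an in-memory database standing in for a file on disk:

```python
import json
import sqlite3

# Local outbox: every write lands here first; rows are marked synced
# only after the server confirms receipt, so nothing is lost offline.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT, synced INTEGER DEFAULT 0)"
)

def queue_record(payload: dict) -> None:
    conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(payload),))
    conn.commit()

def sync_pending(send) -> int:
    """Try to push unsynced rows; `send` returns True on success."""
    synced = 0
    for row_id, payload in conn.execute(
        "SELECT id, payload FROM outbox WHERE synced = 0"
    ).fetchall():
        if send(json.loads(payload)):
            conn.execute("UPDATE outbox SET synced = 1 WHERE id = ?", (row_id,))
            synced += 1
    conn.commit()
    return synced

queue_record({"reading": 42})
queue_record({"reading": 43})
print(sync_pending(lambda record: True))  # 2
```

In a real application, `send` would be an HTTP call (and could be retried from Celery), but the pattern of "persist first, reconcile later" stays the same.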
It has been known since the 70s that developers tend to give very optimistic estimations.
We prefer to have exact numbers, even if that means they are wrong most of the time.
In research, developers admitted that they believe their managers will see them as less competent if they provide estimates with huge margins. But mathematically speaking, providing a wider min-max interval means you will be right more often.
So maybe it isn’t really accuracy that businesses and people are looking for. Maybe estimates are needed for the sole purpose of risk aversion.
Risk can be measured in other ways.
Maybe it is time we stop estimating tasks left and right and instead start managing the project’s risk and customer’s expectations.
Over the years, the lack of an array data type in Python has resulted in the creation of numerous array libraries, each specializing in unique niches but still having some interoperability between each other. NumPy has become the de facto array library of Python, and the other array libraries try to keep their API close to that of NumPy. However, this often becomes infeasible, and the libraries deviate out of necessity. To make Python's array libraries shake hands with each other without any inconsistencies, the Consortium for Python Data API Standards has formalised an Array API standard for libraries offering array creation and manipulation operations.
The Array API standard allows users to write and use the same code for arrays belonging to any of the standard-conforming libraries. Through this talk, we will explore the need for such standardisation and discuss its salient features in detail. We will primarily delve into the example of using this standard to make specific parts of the European Space Agency's Euclid space mission's code GPU and autodiff compatible. Besides cosmology, we will also take a look at a few other examples, mostly sourced from my experience working with and on several Python array libraries for scientific computing. Ultimately, the audience can expect to leave the room with knowledge of both the software engineering and the research sides of the Array API standard.
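The core idea can be sketched in a few lines: a function written only against the standard's namespace runs unchanged on any conforming array library. For simplicity the namespace is passed in explicitly here (in practice you would recover it with array_api_compat.array_namespace), and NumPy stands in for any conforming library:

```python
import numpy as np

# An array-agnostic function: it only calls functions on the namespace `xp`
# it is given, so the same code can run on NumPy, CuPy, PyTorch, etc.,
# provided the namespace conforms to the Array API standard.
def standardize(x, xp):
    return (x - xp.mean(x)) / xp.std(x)

x = np.array([1.0, 2.0, 3.0, 4.0])
z = standardize(x, np)
print(z)  # zero mean, unit standard deviation
```

Swapping `np` for another conforming namespace would move the same computation to a GPU or an autodiff framework without touching `standardize`.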
Our Lightning Talks are open to everyone 😊
How it works:
– You can register directly at the conference. First come, first served.
– Any proposal is welcome, as long as your talk has something to do with Python and respects our Code of Conduct. We reserve the right to reject talks.
– Talk time is strictly limited to 5 minutes.
– To keep turnaround times short, you will not be able to plug in your own device. We will provide a laptop with all slides. Please submit your slides as PDF via email at least 60 minutes before Lightning Talks start.
– By registering, you accept that your talk may be recorded, published and streamed live (audio & video) under Creative Commons Attribution 4.0 International license.
A thank you from the organisers. We hope you enjoyed your day!
(It won't take long, we promise! After that long day, everyone is looking forward to the buffet!)
Demokratis.ch is a non-profit project working to modernise the consultation procedure—a key democratic process that allows Swiss citizens to provide feedback on proposed laws and amendments. Today, the process is slow and cumbersome for everyone involved: it requires studying lengthy PDFs, writing formal letters, and even synthesising legal arguments by copy-pasting into Excel. There’s a huge opportunity to streamline this process and make this democratic tool more accessible and inclusive.
In this talk, I’ll share how we’re tackling this challenge with machine learning: building data processing pipelines, extracting features from endless PDFs, embedding and classifying text, designing and evaluating models—and ultimately deploying them in production. Because the data comes from the federal administration and 26 different cantons, it’s often heterogeneous and in varying formats. Data quality, in general, presents many challenges for both training and evaluation. Spoiler: PDF is a pretty terrible format for machines…
Our approach is practical and pragmatic, and our code is open source, so you’re welcome to explore our solutions or even contribute yourself!
AI agents are the next big thing everyone has been talking about. They are expected to revolutionize various industries by automating routine tasks and mission-critical business workflows, enhancing productivity, and enabling humans to focus on creative and strategic work. Of course, you can apply them to your everyday coding tasks as well.
In this talk, we’ll go over what these agents bring to the coding world, and why they can deliver the promise of smarter coding that the current generation of coding assistants can’t. We will then dive into a quick live-coding session where I’ll show what such agents can do in real life and how you can start using them right after the talk. We’ll finish with some remarks on what programming might look like in the near future as these agents become part of your everyday work.
Traditionally, marketing campaign analysis relies on simple metrics like the number of purchases made after a contact, or conversions following a promotion. While these numbers tell us what happened, they don’t reveal why it happened or if the campaign truly made a difference.
Such analysis can’t distinguish between customers who would have acted anyway and those who were genuinely influenced by the campaign. The key question is: did the campaign actually cause the desired effect?
In this practical and beginner-friendly session, we’ll explore how Causal Machine Learning provides the missing piece in campaign evaluation and targeting.
Starting from real-world scenarios, we’ll dive into:
- Why causality matters more than correlation when evaluating ad performance.
- How to estimate the true impact of a campaign using uplift modeling and treatment effect estimation in just a few lines of code.
- How to target users who are not just likely to interact with ads, but whose behavior can be influenced by the campaign (for example, to reduce churn or boost engagement).
The session will be hands-on with Python, with clear examples drawn from marketing applications.
Take-away:
Participants will gain a practical understanding of how to think causally in digital marketing, learning key techniques to measure impact and target campaigns more intelligently, moving from predictive to truly prescriptive analytics.
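The treated-vs-control comparison at the heart of uplift estimation really does fit in a few lines. This toy sketch uses made-up data and a simple per-segment difference in conversion rates rather than a full uplift model (real projects would reach for libraries such as causalml or EconML):

```python
# Made-up campaign data: for each segment, compare conversion among
# contacted customers (treated) vs. not contacted (control).
records = [
    # (segment, treated, converted)
    ("young", 1, 1), ("young", 1, 1), ("young", 1, 0),
    ("young", 0, 0), ("young", 0, 0), ("young", 0, 1),
    ("senior", 1, 0), ("senior", 1, 1),
    ("senior", 0, 1), ("senior", 0, 1),
]

def uplift(segment):
    treated = [c for s, t, c in records if s == segment and t == 1]
    control = [c for s, t, c in records if s == segment and t == 0]
    # Difference in conversion rates: the campaign's estimated causal effect.
    return sum(treated) / len(treated) - sum(control) / len(control)

print(round(uplift("young"), 2))   # 0.33: the campaign moved this segment
print(round(uplift("senior"), 2))  # -0.5: contacting seniors backfired
```

The takeaway: a raw conversion count would rank "senior" customers as good targets (they convert a lot), while the causal view shows contacting them actually hurts.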
Docling is an open-source Python package that simplifies document processing by parsing diverse formats — including advanced PDF understanding — and integrating seamlessly with the generative AI ecosystem. It supports a wide range of input types such as PDFs, DOCX, XLSX, HTML, and images, offering rich parsing capabilities including reading order, table structure, code, and formulas. Docling provides a unified and expressive DoclingDocument format, enabling easy export to Markdown, HTML, and lossless JSON. It offers plug-and-play integrations with popular frameworks like LangChain, LlamaIndex, Crew AI, and Haystack, along with strong local execution support for sensitive data and air-gapped environments. As a Python package, Docling is pip-installable and comes with a clean, intuitive API for both programmatic and CLI-based workflows, making it easy to embed into any data pipeline or AI stack. Its modular design also supports extension and customization for enterprise use cases.
We also introduce SmolDocling, an ultra-compact 256M parameter vision-language model for end-to-end document conversion. SmolDocling generates a novel markup format called DocTags that captures the full content, structure, and spatial layout of a page, and offers accurate reproduction of document features such as tables, equations, charts, and code across a wide variety of formats — all while matching the performance of models up to 27× larger.
This talk will detail how to integrate external threat intelligence data into an autonomous agentic AI system for proactive cybersecurity. Using real-world datasets—including open-source threat feeds, security logs, or OSINT—you will learn how to build a data ingestion pipeline, train models with Python, and deploy agents that autonomously detect and mitigate cyber threats. This case study will provide practical insights into data preprocessing, feature engineering, and the challenges of adversarial conditions.
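A minimal, hypothetical illustration of the ingestion-and-detection step (the IPs come from documentation address ranges; a real pipeline would fetch feeds over the network and hand matches to an agent for triage):

```python
import ipaddress

# Normalise a toy threat feed of known-bad IPs into a set for fast lookup.
threat_feed = ["203.0.113.7", "198.51.100.23"]
bad_ips = {ipaddress.ip_address(ip) for ip in threat_feed}

# Sample "security log" entries; real logs would be parsed from files or SIEM.
logs = [
    {"src": "192.0.2.1", "action": "GET /"},
    {"src": "203.0.113.7", "action": "POST /login"},
]

# Flag any log entry whose source matches the feed.
alerts = [entry for entry in logs if ipaddress.ip_address(entry["src"]) in bad_ips]
print(len(alerts))  # 1
```

Using ipaddress rather than raw string comparison keeps the matching robust to formatting differences and extends naturally to CIDR ranges.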
Data is the fossil fuel of the machine learning world: essential for developing high-quality models but in limited supply. Yet institutions handling sensitive documents — such as financial, medical, or legal records — often cannot fully leverage their own data due to stringent privacy, compliance, and security requirements, making it difficult to train high-quality models.
A promising solution is to replace the personally identifiable information (PII) with realistic synthetic stand-ins, whilst leaving the rest of the document intact.
In this talk, we will discuss the use of open source tools and models that can be self hosted to anonymize documents. We will go over the various approaches for Named Entity Recognition (NER) to identify sensitive entities and the use of diffusion models to inpaint anonymized content.
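To make the idea concrete, here is a toy anonymiser: a regex stands in for the NER model (spaCy or Presidio would be used in practice), and a fixed placeholder stands in for diffusion-based inpainting of realistic synthetic content:

```python
import re

# Detector stand-in: a crude email pattern. A real pipeline would use an NER
# model to find names, addresses, account numbers, and other PII.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def anonymize(text: str) -> str:
    # Replacement stand-in: a placeholder instead of a realistic synthetic email.
    return EMAIL.sub("[EMAIL]", text)

doc = "Contact alice@example.com for the contract."
print(anonymize(doc))  # Contact [EMAIL] for the contract.
```

The structure generalises: detection and replacement are separate, swappable steps, which is what lets a self-hosted NER model and an inpainting model slot in without changing the pipeline.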
Activation functions are fundamental elements of deep learning architectures as they significantly influence training dynamics. ReLU, while widely used, is prone to the dying neuron problem, which has been mitigated by variants such as LeakyReLU, PReLU, and ELU that better handle negative neuron outputs. Recently, self-gated activations like GELU and Swish have emerged as state-of-the-art alternatives, leveraging their smoothness to ensure stable gradient flow and prevent neuron inactivity.
In this work, we introduce the Gompertz Linear Unit (GoLU), a novel self-gated activation function defined as GoLU(x) = x · Gompertz(x), where Gompertz(x) = exp(−exp(−x)). The GoLU activation leverages the asymmetry in the Gompertz function to reduce variance in the latent space more effectively compared to GELU and Swish, while preserving robust gradient flow. Extensive experiments across diverse tasks, including Image Classification, Language Modeling, Semantic Segmentation, Object Detection, Instance Segmentation, and Diffusion, highlight GoLU's superior performance relative to state-of-the-art activation functions, establishing GoLU as a robust alternative to existing activation functions.
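The definition translates directly into code; a scalar sketch (a real implementation would operate elementwise on tensors):

```python
import math

# GoLU(x) = x * Gompertz(x), with the gate Gompertz(x) = exp(-exp(-x)).
def golu(x: float) -> float:
    return x * math.exp(-math.exp(-x))

print(golu(0.0))            # 0.0
# For large positive x the gate approaches 1, so GoLU(x) ~ x;
# for large negative x the gate decays smoothly towards 0.
print(round(golu(10.0), 3))  # 10.0
```

Like GELU and Swish, the gate is smooth everywhere, avoiding ReLU's hard zero cut-off; the asymmetry comes from the double exponential in the Gompertz function.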
Networks are all around us, shaping phenomena like epidemics, communication, and transportation. In this talk, we will explore how real-world problems can be analyzed and solved using graph-based methods and simple algorithms. Drawing from examples such as trade networks, corporate structures, and historical data, I will demonstrate how network analysis reveals insights that would otherwise remain hidden. Using NetworKit (and NetworkX), we will analyze real-world datasets to answer questions like:
- What does the core-periphery model reveal about trade networks?
- Could we have predicted that Moscow would become Russia's capital?
- How do corporate hierarchies differ from interaction hierarchies within organizations?
Throughout the talk, I will introduce key concepts in network analysis and showcase Python as a tool for research. Attendees will have access to all datasets and code, enabling them to replicate the analyses and apply these techniques to their own projects. This session is designed for Python enthusiasts with an interest in data science, networks, and/or applied research.
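To give a minimal taste of this kind of question, here is a toy sketch with a made-up trade network and plain dictionaries (the talk itself uses NetworKit and NetworkX): which settlement is the best-connected hub?

```python
# A tiny made-up trade network as an adjacency dict.
trade = {
    "Moscow": ["Tver", "Novgorod", "Ryazan"],
    "Tver": ["Moscow", "Novgorod"],
    "Novgorod": ["Moscow", "Tver"],
    "Ryazan": ["Moscow"],
}

def degree_centrality(graph):
    # Normalise each node's degree by the number of possible neighbours.
    n = len(graph) - 1
    return {node: len(neigh) / n for node, neigh in graph.items()}

centrality = degree_centrality(trade)
hub = max(centrality, key=centrality.get)
print(hub)  # Moscow
```

NetworkX offers the same measure as nx.degree_centrality, plus the core-periphery and hierarchy analyses the talk covers; the dict version just shows there is no magic involved.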