Code and data for the intention graph construction pipeline and the recommendation experiments from the paper "Intention Knowledge Graph Construction for User Intention Relation Modeling" (arXiv:2412.11500), accepted by EACL 2026.
If you use this code or data, please cite:
```bibtex
@misc{bai2025intentionknowledgegraphconstruction,
  title={Intention Knowledge Graph Construction for User Intention Relation Modeling},
  author={Jiaxin Bai and Zhaobo Wang and Junfei Cheng and Dan Yu and Zerui Huang and Weiqi Wang and Xin Liu and Chen Luo and Yanming Zhu and Bo Li and Yangqiu Song},
  year={2025},
  eprint={2412.11500},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2412.11500},
}
```
- `rec_model/`: session-based recommendation experiments.
- `generation_results/`, `annotation/`, `data_preprocess/`: intention generation and annotation utilities.
- `prompting.py`, `gpt35_prompting.py`, `gpt4_prompting.py`: LLM prompting code.
- `answer_process.py`: post-processing to build intention graph outputs.
The intention graph is built by prompting LLMs to generate intentions and then post-processing the outputs into structured triples.
- Prompting for intentions: `prompting.py` is a minimal runner using Azure OpenAI (see `openai.api_base` and `openai.api_key`). The example invocation in `prompting.py` writes JSONL to `generation_results/` (one JSON object per line with the session, prompt, and LLM answer). `gpt35_prompting.py` / `gpt4_prompting.py` are alternative entry points.
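The JSONL convention (one JSON object per line) can be sketched as follows. The field names `session`, `prompt`, and `answer` follow the description above but are assumptions about the exact keys `prompting.py` uses.

```python
import json

def write_records(path, records):
    """Write one JSON object per line (JSONL), as the prompting scripts do."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")

def read_records(path):
    """Read all records back from a JSONL file, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

For example, `write_records("out.jsonl", [{"session": [101, 102], "prompt": "...", "answer": "..."}])` produces a file that `read_records` round-trips exactly.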
- Parsing and graph prep: `answer_process.py` parses the raw answers into a triple file and can create intermediate artifacts (e.g., `result_triple.txt` and pickles for sessions).
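As an illustration of this post-processing step, here is a hypothetical parser. The `(head | relation | tail)` line format is an assumption for the sketch; the actual answer format consumed by `answer_process.py` may differ.

```python
import re

# Hypothetical answer format: one "(head | relation | tail)" triple per line.
TRIPLE_RE = re.compile(r"\(\s*([^|]+?)\s*\|\s*([^|]+?)\s*\|\s*([^|)]+?)\s*\)")

def parse_triples(answer: str):
    """Extract (head, relation, tail) tuples from a raw LLM answer string."""
    return [m.groups() for m in TRIPLE_RE.finditer(answer)]
```

Each extracted tuple can then be appended to a triple file such as `result_triple.txt`, one triple per line.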
The VERA-based relation scoring scripts live in `discourse_model/`.
Example (single split):
```shell
cd discourse_model
CUDA_VISIBLE_DEVICES=0 python vera_evaluation.py -s 0
```
Notes:
- `vera_evaluation.py` loads the VERA encoder from `liujch1998/vera-base` and expects a fine-tuned checkpoint at `/data/jbai/cjf/Vera/vera_best_model.pth`. Update that path if needed.
- The input intention list is read from `data_preprocess/generation_results/gpt-35-turbo_answer_<split>_intentions.json`.
- Convenience scripts `vera_sampling_*.sh` run different splits.
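A minimal sketch of the scoring loop, with a stand-in `score_fn` in place of the actual VERA model. The JSON-array input format and the output keys are assumptions for illustration, not taken from `vera_evaluation.py`.

```python
import json

def score_intentions(in_path, out_path, score_fn):
    """Read an intention list (assumed to be a JSON array of strings),
    score each statement with score_fn, and write one
    {"intention": ..., "score": ...} JSON object per output line."""
    with open(in_path, encoding="utf-8") as f:
        intentions = json.load(f)
    with open(out_path, "w", encoding="utf-8") as f:
        for text in intentions:
            f.write(json.dumps({"intention": text, "score": score_fn(text)}) + "\n")
```

In the real script, `score_fn` would wrap a forward pass through the fine-tuned VERA checkpoint; here it is any callable mapping a string to a float.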
- The default dataset is `rec_model/data/m2.txt`.
- Full raw data files are available here: Google Drive folder.
- Format: one session per line as `user_id item_id_1 item_id_2 ... item_id_n` (space-separated integers).
- `rec_model/mat_m2_seqf.npz` and `rec_model/mat_m2.npz` are sparse matrices used by the model code. Keep them in `rec_model/` when running experiments.
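Given that format, each line of `m2.txt` can be parsed with a few lines of Python (the function name is ours, not from the repo):

```python
def parse_session(line: str):
    """Parse one dataset line: 'user_id item_1 ... item_n' (space-separated ints).
    Returns the user id and the ordered list of item ids."""
    user_id, *items = map(int, line.split())
    return user_id, items
```

For example, the line `"7 101 102 103"` yields user `7` with the session `[101, 102, 103]`.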
From the repo root:
```shell
cd rec_model
python main.py --data_dir ./data/ --data_name m2 --gpu_id 0
```
Key arguments in `rec_model/main.py`:
- `--max_seq_length`: maximum session length (default: 20).
- `--hidden_size`, `--num_hidden_layers`, `--num_attention_heads`: SASRec model size.
- `--batch_size`, `--epochs`, `--lr`: training hyperparameters.
- `--no_cuda`: force CPU if needed.
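The argument surface can be sketched with `argparse` as below. Apart from `--max_seq_length` (default 20, stated above), every default here is an illustrative placeholder, not a value read from `rec_model/main.py`.

```python
import argparse

def build_parser():
    """Mirror of the key rec_model/main.py arguments. All defaults except
    --max_seq_length are placeholders for illustration only."""
    p = argparse.ArgumentParser(description="Session-based recommendation (sketch)")
    p.add_argument("--data_dir", default="./data/")
    p.add_argument("--data_name", default="m2")
    p.add_argument("--gpu_id", default="0")
    p.add_argument("--max_seq_length", type=int, default=20)  # stated default
    p.add_argument("--hidden_size", type=int, default=64)       # placeholder
    p.add_argument("--num_hidden_layers", type=int, default=2)  # placeholder
    p.add_argument("--num_attention_heads", type=int, default=2)  # placeholder
    p.add_argument("--batch_size", type=int, default=256)       # placeholder
    p.add_argument("--epochs", type=int, default=200)           # placeholder
    p.add_argument("--lr", type=float, default=0.001)           # placeholder
    p.add_argument("--no_cuda", action="store_true")
    return p
```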
- Logs are written to `rec_model/logs_m2.txt`.
- Validation/test metrics are printed every epoch.
- Splits are random with a 0.8/0.1/0.1 ratio by default in `rec_model/utils.py:get_user_seqs_split`.
- Training sessions are augmented by adding prefixes of length >= 2.
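Assuming "prefixing" means that every prefix of length >= 2 becomes an additional training sample, the augmentation can be sketched as (the function name is ours):

```python
def augment_prefixes(items):
    """Expand one training session into all of its prefixes of length >= 2.
    Sessions shorter than 2 items produce no training samples."""
    return [items[:k] for k in range(2, len(items) + 1)]
```

For a session `[1, 2, 3]` this yields the samples `[1, 2]` and `[1, 2, 3]`, so each item (after the first) serves once as a prediction target.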