Feynman Innovations

ajibawa-2023 · AjinkyaBawase

AI & ML interests

LLM, RL, DL, ML, AGI. Developing LLMs (preferably fully fine-tuned) for various use cases.

Recent Activity

reacted to their post with 👍 about 5 hours ago

Cpp-Code-Large Dataset: https://huggingface.co/datasets/ajibawa-2023/Cpp-Code-Large

Cpp-Code-Large is a large-scale corpus of more than 5 million lines of C++ source code. The dataset is designed to support research in large language model (LLM) pretraining, code intelligence, software engineering automation, and static program analysis for the C++ ecosystem. By providing a high-volume, language-specific corpus, it enables systematic experimentation in C++-focused model training, domain adaptation, and downstream code understanding tasks.

Cpp-Code-Large addresses the need for a dedicated C++-only dataset at substantial scale, enabling focused research across systems programming, performance-critical applications, embedded systems, game engines, and large-scale native software projects.

reacted to their post with 🚀 about 5 hours ago

reacted to their post with 🔥 about 5 hours ago

ajibawa-2023's models (32)

ajibawa-2023/Python-Code-13B

Text Generation • 13B • Updated Nov 9, 2024 • 818 • 7

ajibawa-2023/Young-Children-Storyteller-Mistral-7B

Text Generation • 7B • Updated Jun 26, 2024 • 71 • 23

ajibawa-2023/SlimOrca-Llama-3-8B

Text Generation • 8B • Updated May 27, 2024 • 8 • 4

ajibawa-2023/Code-Llama-3-8B

Text Generation • 8B • Updated May 8, 2024 • 97 • 31

ajibawa-2023/Uncensored-Frank-Llama-3-8B

Text Generation • 8B • Updated May 8, 2024 • 14 • 13

ajibawa-2023/Scarlett-Llama-3-8B-v1.0

Text Generation • Updated May 7, 2024 • 2 • 5

ajibawa-2023/Scarlett-Llama-3-8B

Text Generation • Updated Apr 26, 2024 • 5 • 8

ajibawa-2023/Code-Mistral-7B

Text Generation • 7B • Updated Apr 26, 2024 • 68 • 15

ajibawa-2023/General-Stories-Mistral-7B

Text Generation • Updated Apr 23, 2024 • 11 • 5

ajibawa-2023/Code-Jamba-v0.1

Text Generation • 52B • Updated Apr 12, 2024 • 16 • 7

ajibawa-2023/WikiHow-Mistral-Instruct-7B

Text Generation • 7B • Updated Apr 7, 2024 • 69 • 7

ajibawa-2023/Uncensored-Jordan-13B

Text Generation • 13B • Updated Apr 4, 2024 • 833 • 7

ajibawa-2023/OpenHermes-2.5-Code-290k-13B

Text Generation • 13B • Updated Mar 17, 2024 • 67 • 11

ajibawa-2023/Code-290k-6.7B-Instruct

Text Generation • 7B • Updated Mar 4, 2024 • 66 • 6

ajibawa-2023/Code-13B

Text Generation • Updated Mar 4, 2024 • 785 • 13

ajibawa-2023/SlimOrca-13B

Text Generation • Updated Mar 4, 2024 • 781 • 11

ajibawa-2023/Python-Code-33B

Text Generation • Updated Mar 4, 2024 • 790 • 8

ajibawa-2023/Code-290k-13B

Text Generation • Updated Mar 4, 2024 • 723 • 8

ajibawa-2023/scarlett-33b

Text Generation • Updated Feb 28, 2024 • 785 • 25

ajibawa-2023/Code-33B

Text Generation • Updated Dec 13, 2023 • 788 • 7

ajibawa-2023/Uncensored-Jordan-7B

Text Generation • Updated Nov 20, 2023 • 790 • 5

ajibawa-2023/Uncensored-Jordan-33B

Text Generation • Updated Nov 18, 2023 • 791 • 7

ajibawa-2023/carl-33b

Text Generation • Updated Nov 18, 2023 • 788 • 10

ajibawa-2023/Uncensored-Frank-33B

Text Generation • Updated Nov 18, 2023 • 791 • 7

ajibawa-2023/Uncensored-Frank-13B

Text Generation • Updated Nov 18, 2023 • 795 • 8

ajibawa-2023/scarlett-7b

Text Generation • Updated Nov 18, 2023 • 781 • 4

ajibawa-2023/Uncensored-Frank-7B

Text Generation • Updated Nov 18, 2023 • 816 • 5

ajibawa-2023/carl-7b

Text Generation • Updated Nov 18, 2023 • 779 • 6

ajibawa-2023/Scarlett-Phi

Text Generation • Updated Oct 10, 2023 • 2 • 8

ajibawa-2023/carl-13b

Text Generation • Updated Aug 16, 2023 • 781 • 6